如何解决过拟合与欠拟合


过拟合

  • 添加其他特征项。组合、泛化、相关性、上下文特征、平台特征等特征是特征添加的重要手段,有时候特征项不够会导致模型欠拟合。
  • 添加多项式特征。例如将线性模型添加二次项或三次项使模型泛化能力更强。例如,FM(Factorization Machine)模型、FFM(Field-aware Factorization Machine)模型,其实就是线性模型,增加了二阶多项式,保证了模型一定的拟合程度。
  • 可以增加模型的复杂程度。
  • 减小正则化系数。正则化的目的是用来防止过拟合的,但是现在模型出现了欠拟合,则需要减少正则化参数。

欠拟合

  • 重新清洗数据,数据不纯会导致过拟合

Read more

Linux Basics - Introduction


Linux Basics - Introduction

This is a series of tutorials for commonly used Linux commands. All examples included are based on CentOS 7.5

$ lsb_release -a
LSB Version:    :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description:    CentOS Linux release 7.5.1804 (Core)
Release:    7.5.1804

Read more

Linux Basics - First Step


Linux Basics - First Step

Whenever you open a terminal and want to perform some kind of tasks, you start to type Linux commands. But there could be thousands of commands and even a much larger amount of options to go with these commands. How could you remember all of them? The answer is impossible.

Read more

Linux Basics - Daily Recipe


Linux Basics - Daily Recipe

Among an ocean of Linux commands, some are used much more frequently than others. Here is the listing of some of the most common ones:

  • pwd: Show your current working directory
  • cd: Change directory
  • ls: Show files in current working directory
  • cat: Show content of a file
  • mk

Read more