Self-introduction
Student years
For my master's degree, I majored in pure mathematics.
For my doctorate, I majored in applied mathematics.
My first job
My first job was at Newegg's laboratory.
The laboratory's work was diverse. For example, we studied the source code of the Lucene search engine in order to add real-time indexing, and we used OCR to recognize prices scraped from competitors' websites.
Algorithm development work
I first began algorithm development work at Tudou.
At that time, my KPI was to increase the click-through rate by 10% every quarter, and the company's recommendation algorithm was collaborative filtering.
I replaced it with a probabilistic transition model.
The click-through rate of the recommendations rose from about 20% to about 50%.
The main idea of the algorithm
The collaborative filtering algorithm assumes that a user's interests stay constant over time.
My algorithm assumes that a user's interests change over time, much like human desires. For example, once I have drunk enough water, I no longer want to drink.
The idea of my algorithm follows from this: if a user's viewing interests change over time, then they are stable only over a short period.
First, for each user, we keep only the videos of which the user watched at least 90%, and arrange them in the order they were watched. This gives each user a viewing path. Note that the 90% viewing ratio applies per video and per user. Next, we collect the viewing paths of all users and use them to calculate the transition probability between videos. For example, if video A is viewed 1000 times and 500 of those views are followed by video B, then the transition probability from A to B is 50%.
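The path-building and transition-probability steps described above can be sketched in a few lines of Python. This is only a minimal illustration, not the production code: the function names, the `(user_id, video_id, watch_ratio)` log format, and the sample data are all assumptions, and the log is assumed to be in chronological order already. Following the example above, the denominator is the number of qualifying views of the first video.

```python
from collections import Counter, defaultdict


def build_paths(view_log, threshold=0.9):
    """Keep each user's videos watched to at least `threshold`, in log order."""
    paths = defaultdict(list)
    for user_id, video_id, watch_ratio in view_log:
        if watch_ratio >= threshold:
            paths[user_id].append(video_id)
    return dict(paths)


def transition_probabilities(paths):
    """Estimate P(next video | current video) from consecutive pairs in all paths."""
    view_counts = Counter()            # qualifying views per video
    pair_counts = defaultdict(Counter)  # pair_counts[a][b] = times b directly followed a
    for path in paths.values():
        view_counts.update(path)
        for current, following in zip(path, path[1:]):
            pair_counts[current][following] += 1
    return {
        current: {nxt: n / view_counts[current] for nxt, n in followers.items()}
        for current, followers in pair_counts.items()
    }


# Hypothetical sample log: (user_id, video_id, watch_ratio)
view_log = [
    ("u1", "A", 0.95), ("u1", "B", 1.00),
    ("u2", "A", 0.92), ("u2", "B", 0.97),
    ("u3", "A", 0.99), ("u3", "C", 0.91),
    ("u4", "A", 1.00),
    ("u5", "A", 0.50), ("u5", "B", 0.95),  # A is below the threshold, so it is dropped
]
probs = transition_probabilities(build_paths(view_log))
print(probs["A"])  # {'B': 0.5, 'C': 0.25}
```

In this sample, A has four qualifying views and two of them are followed by B, so the transition probability from A to B is 50%. Because the denominator counts all views of A, the probabilities for A need not sum to 1; the remainder corresponds to users who watched nothing further.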
We can also calculate transition probabilities using other viewing thresholds. This gives us about ten functions, f1, f2, ..., f10, which differ only in their viewing threshold. We use A/B testing to determine the weight of each function, and the weighted sum of these functions becomes the final recommendation function.
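The weighted combination can be sketched as follows. This is a hedged illustration only: the `recommend` function, the two small transition tables, and the weight values are hypothetical. In practice the weights would come from A/B tests, which this sketch does not attempt to model.

```python
from collections import defaultdict


def recommend(video, tables, weights, top_n=3):
    """Rank candidate next videos by the weighted sum of transition probabilities."""
    scores = defaultdict(float)
    for name, table in tables.items():
        for candidate, prob in table.get(video, {}).items():
            scores[candidate] += weights[name] * prob
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [candidate for candidate, _ in ranked[:top_n]]


# Hypothetical transition tables, one per viewing threshold
tables = {
    "f1": {"A": {"B": 0.5, "C": 0.25}},  # e.g. built with a 90% threshold
    "f2": {"A": {"B": 0.4, "D": 0.2}},   # e.g. built with a lower threshold
}
weights = {"f1": 0.5, "f2": 0.5}  # placeholder values, not A/B test results

top = recommend("A", tables, weights)
print(top)  # ['B', 'C', 'D']
```

Here B scores highest because both tables support it, while C and D each appear in only one table.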
The click-through rate of this recommendation algorithm was much higher than before: about 50%, up from about 20%.
I prefer developing graphical programming tools.
For example, development tools modeled on flowcharts, on data flows, or on program code.
A commonly used framework for recommendation algorithms is recall plus ranking.
DSSM = Deep Structured Semantic Model