Dian Jiang Chun (Rouged Lips): No one about, no heart for anything, so much to long for. In the thin rain of a cold night, the scent of water and charming words. The wind is sentimental; thousands of miles away, alone, looking back. After parting, tears on the sleeves, pity turned to stillness. -- A computer
Can a computer write Song ci lyrics? How is that done? If you look carefully, you will find that every image in this poem appears often in "authentic" Song ci. Indeed, the poem was "created" by analyzing the whole body of Song ci: the sentences were broken into words, the high-frequency words were collected, and those words were then assembled according to the format of ci poetry.
Clearly, the first step, breaking sentences into words, is the crucial one. This operation is called word segmentation. There are many segmentation methods, and the problem is widely studied; its use is by no means limited to automatic lyric writing. So what segmentation methods exist, and what are they good for?
Word segmentation: easy or hard?
In English, word segmentation is a relatively simple task, because there is a natural delimiter (the space) between one word and the next. You only need to handle inflection, such as singular and plural forms (apple/apples, bus/buses, woman/women), tenses (write/wrote/writing), and changes in part of speech, so that forms referring to the same word are grouped into one unit. You also need to watch for words that share a spelling but differ in meaning, such as lay (the past tense of lie, to put something down, to lay eggs). On the whole, though, the boundaries between words are obvious.
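The two English steps described above, splitting on the natural delimiters and then grouping inflected forms, can be sketched in a few lines of Python. This is only an illustration: the `LEMMAS` table is a tiny hand-written assumption, and the function names `tokenize` and `normalize` are made up for this example. A real system would rely on a full morphological dictionary.

```python
import re

# Hypothetical, hand-written table for illustration only.
# It maps a few inflected forms to their base form.
LEMMAS = {
    "apples": "apple",
    "buses": "bus",
    "women": "woman",
    "wrote": "write",
    "writing": "write",
}

def tokenize(text):
    """Lowercase the text and return its runs of letters as word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def normalize(tokens):
    """Replace each token by its base form when the toy table knows it."""
    return [LEMMAS.get(tok, tok) for tok in tokens]

tokens = tokenize("Two women wrote about apples and buses.")
print(normalize(tokens))
# ['two', 'woman', 'write', 'about', 'apple', 'and', 'bus']
```

Note that a lookup table like this cannot tell the different meanings of lay apart; resolving that requires context, which is outside the scope of this sketch.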
Chinese is markedly different. The language is so rich that the same sentence can often be segmented in more than one way. For example, 乒乓球拍卖完了 can be read as "乒乓球/拍卖/完了" (table tennis ball / auction / finished) or as "乒乓球拍/卖/完了" (table tennis paddle / sell / finished). This makes Chinese word segmentation a difficult and complicated piece of engineering.
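The ambiguity is easy to reproduce. The sketch below assumes the example sentence is 乒乓球拍卖完了 (which matches the two readings "table tennis ball / auction / finished" and "table tennis paddle / sell / finished") and uses a small hand-picked vocabulary, `VOCAB`, that is purely illustrative. The function lists every split of the sentence in which all pieces are vocabulary words.

```python
def segmentations(sentence, vocab):
    """Return every way to split `sentence` into words that all appear in `vocab`."""
    if not sentence:
        return [[]]  # one valid segmentation of the empty string: no words
    results = []
    for end in range(1, len(sentence) + 1):
        word = sentence[:end]
        if word in vocab:
            # Keep this word and segment the remainder recursively.
            for rest in segmentations(sentence[end:], vocab):
                results.append([word] + rest)
    return results

# Illustrative vocabulary, assumed for this example only.
VOCAB = {"乒乓球", "乒乓球拍", "拍卖", "卖", "完了"}

for seg in segmentations("乒乓球拍卖完了", VOCAB):
    print("/".join(seg))
# 乒乓球/拍卖/完了
# 乒乓球拍/卖/完了
```

Both outputs are valid according to the vocabulary, so a dictionary alone cannot decide which reading is intended.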
There is, however, a simple brute-force approach: enumerate every combination of adjacent characters in each sentence, then count how often each combination occurs across the whole corpus. This shortcut works for Song ci because the sentences themselves are short and word length is limited. For more general text, brute-force splitting does not apply: the amount of computation is too large, and the accuracy is low.
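A minimal sketch of this brute-force counting follows. It is an assumption about how such a count might be written, not the method actually used to produce the poem above. The name `count_candidates`, the `max_len` cap, and the three short toy lines standing in for a corpus are all hypothetical.

```python
from collections import Counter

def count_candidates(lines, max_len=3):
    """Count every run of 1..max_len adjacent characters across all lines."""
    counts = Counter()
    for line in lines:
        for start in range(len(line)):
            for length in range(1, max_len + 1):
                if start + length <= len(line):
                    counts[line[start:start + length]] += 1
    return counts

# Toy lines for illustration only; this is not a real corpus.
TOY_LINES = ["东风夜放", "东风无力", "夜来风雨"]

counts = count_candidates(TOY_LINES, max_len=2)
# Two-character candidates that occur more than once are likely words.
print([w for w, c in counts.items() if len(w) == 2 and c > 1])
# ['东风']
```

Each line of n characters contributes roughly n times `max_len` candidates, which stays manageable for short lines with a small cap but grows quickly for long, general text, and most of the candidates counted are not words at all.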
So what segmentation methods are there for the general case? Here is a brief introduction to two that are commonly used and relatively easy to understand.