## Wikipedia word frequency

March 28, 2017

Download a file that lists every word longer than two characters in the Wikipedia corpus together with its frequency, sorted by frequency in descending order. Each line of the file has the format

(<freq>,<word>)

Here is an excerpt showing the first 10 entries:

(134754521,the)
(54428880,and)
(23193612,was)
(15908522,for)
(13531284,with)
(10813887,that)
(9944531,from)
(9452065,his)
(6237633,were)
(5892049,are)
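
To work with the file programmatically, each line can be split back into a word and its count. The following is a minimal Python sketch, not part of the original download: the names `LINE_RE` and `parse_entries` are made up for illustration, and it assumes one `(<freq>,<word>)` entry per line with nothing else on the line. Lines that do not match that shape are skipped.

```python
import re
from typing import Iterable, Iterator, Tuple

# Hypothetical pattern for one "(<freq>,<word>)" line:
# an integer count, a comma, then the word up to the closing parenthesis.
LINE_RE = re.compile(r"^\((\d+),(.+)\)$")


def parse_entries(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (word, frequency) pairs from lines in the (<freq>,<word>) format."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank or malformed lines
        yield match.group(2), int(match.group(1))


# The first three entries from the excerpt above, used as in-memory sample data.
sample = ["(134754521,the)", "(54428880,and)", "(23193612,was)"]
entries = list(parse_entries(sample))
frequencies = dict(entries)
```

Because `parse_entries` accepts any iterable of strings, an open file handle can be passed to it directly, so the whole file never has to be held in memory as raw text. The file's text encoding is not stated here, so pass whichever encoding applies when opening it.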


Tags: #misc