Main Page

From LQ's wiki
Revision as of 12:27, 4 October 2014 by Changtau2005 (Talk | contribs)

Page currently under reconstruction (4th October 2014). Expected to finish in several hours. Please check back later :)


I'm Li, a fourth-year student at University College London currently working on an MEng in Computer Science. I'm most interested in applications of machine learning to large data sets. I haven't decided on a specific research area, primarily because I don't think I've seen enough of the field yet. However, my current interests lean towards applying machine learning to data mining, semantic computation, and natural language processing. The data I've worked with in the past has been web-based (AOL search logs, Bing session data, mined Twitter data, YAGO2).

Why computer science? At first, I decided to enter the field because I love building things. Stacks. Factories. Interfaces. Semaphores. Software systems are teeming cities running like clockwork on top of layers and layers of abstraction. I thought that I wanted to be a developer for sure, but then I began to see some really interesting problems in the field, and interesting approaches to solving them, so I focused my efforts on research too. Computer science (and AI / machine learning) is very much in the middle of interdisciplinary research, and I think this is where the most exciting things are happening. Actually, I was previously a medical student at Imperial College London (I left after two years), but that's a story for another time ;)

For people unfamiliar with computer science, machine learning really is just pattern recognition. If you can reduce a problem to a pattern recognition problem, then you can apply machine learning to solve it. It is a powerful technique that we can use to find features / trends / patterns hidden within huge amounts of data (DNA, stock ticks, the internet), or to classify that data into different categories (think of an algorithm that recognizes faces, road signs, or system intrusions based on anomalous behaviour patterns).


  • [ pdf | MediaWiki ] -- (UK version - 2 pages)
  • [ todo | todo ] -- (US version - 1 page)


Microsoft Research Cambridge




I was fortunate enough to be involved in several short-term research projects (two to six months each) during my undergraduate years. Generally, internship opportunities for undergraduate students in the UK tend to be limited to development work.

Big Five Personality Classification of Twitter Profile by Machine Learning

This is the title of my Master's dissertation. At the time of writing, I've just begun to work on it, so everything is still highly tentative. Supervisor: Emine Yilmaz. Personal tutor: Dr. Kevin Bryson.

By mining the text corpus of individual Twitter profiles, we hope to classify each user along the five dimensions of the Big Five model. We plan to do so by identifying adjectives in the text that are labeled with a "weight" towards one end of each dimension. Such labels can be found in the seminal Allport-Odbert 1936 list and in similar works. We are scoping the project to only consider Twitter profiles in English.
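
To make the idea concrete, here is a minimal sketch of weighted-adjective scoring. Since the project is still tentative, this is only an illustration and not the method we will necessarily end up with. The `LEXICON` entries, their weights, and the `score_profile` function are all made up for the example; real labels would have to come from a source such as the Allport-Odbert list.

```python
import re

# The five dimensions of the Big Five model.
TRAITS = (
    "openness",
    "conscientiousness",
    "extraversion",
    "agreeableness",
    "neuroticism",
)

# Hypothetical lexicon: adjective -> (trait, weight).
# A positive weight pulls towards the high end of the trait, a negative
# weight towards the low end. These entries are invented for illustration.
LEXICON = {
    "talkative": ("extraversion", 1.0),
    "quiet": ("extraversion", -1.0),
    "organized": ("conscientiousness", 1.0),
    "careless": ("conscientiousness", -1.0),
    "anxious": ("neuroticism", 1.0),
    "calm": ("neuroticism", -1.0),
}


def score_profile(tweets):
    """Sum the lexicon weights of every known adjective in a list of tweets."""
    totals = dict.fromkeys(TRAITS, 0.0)
    for tweet in tweets:
        # Naive tokenisation: lowercase runs of letters and apostrophes.
        for token in re.findall(r"[a-z']+", tweet.lower()):
            if token in LEXICON:
                trait, weight = LEXICON[token]
                totals[trait] += weight
    return totals


scores = score_profile([
    "I am talkative and organized",
    "Feeling calm, not anxious today",
])
```

Note that this sketch ignores negation ("not anxious" still counts towards neuroticism) and does no part-of-speech tagging, both of which a real system would need to handle.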

We hope that the findings will form a basis for further research into identifying individuals who show potential signs of depression based on their Twitter activity. Depending on the speed of progress, we might have some time to consider this part of the problem.


Task Identification using Search Engine Query Logs




Currently, this wiki is mostly used to construct and publish dynamic/modular documents, since wikitext/HTML is easier to work with than LaTeX in some cases. MediaWiki also works as a convenient CMS for the dev diary.