About Me

Hi, my name is Yunong. I have worked at Centiment.io as a data scientist since October 2016, after graduating from the NYU Center for Urban Science + Progress. I specialize in data cleaning, munging, modeling, and visualization of spatio-temporal data.


Technical Skills:

  • Data Mining: Python, R, C++
  • Web Development: HTML, CSS, Bootstrap, d3.js, leaflet.js, Flask
  • Big Data: Hadoop, Spark
  • Cloud Computing: AWS EC2
  • GIS: ArcGIS
  • SQL: PostgreSQL, MySQL
  • Others: MATLAB, OpenRefine, AWS SQS, LaTeX, Microsoft Office Applications

Work Experience:

Research Assistant, NYU mHealth MapCorps

  • Assisted 7 research scientists and 4 engineers in conducting a mobile health project
  • Set up the team's working environment (LAMP, Git, R) on AWS EC2 instances
  • Fetched data from an AWS SQS queue and loaded it into a PostgreSQL database
  • Ported the research volunteer recruiting program from R to Python
  • Stitched pictures into panoramic photographs using Python and shell scripts
  • Blurred human faces in panoramic photographs with OpenCV

Projects:

WiFind

  • Developed a system for collecting, analyzing and visualizing WiFi signal data with a team of 6
  • Rendered raster map tiles of WiFi signals on the server using the Django framework
  • Detected WiFi signal blind spots for the Red Hook Initiative using our system
  • Designed and coded a mobile-friendly website with Bootstrap, AngularJS and Leaflet.js to introduce the project and host the map UI

Twitter Primary

  • Built a web app for analyzing real-time tweets about 2016 poll candidates
  • Extracted keywords and sentiment from real-time tweets using Python's NLTK
  • Scheduled the analysis program with cron and served the analysis results on Heroku using Flask
  • Connected and visualized the raw data and analysis results on a dashboard built with d3.js

Exploring the Empty Taxi Problem in NYC

  • Filtered all empty-taxi trip records from the TLC Taxi Trip Data (20GB) using Spark
  • Estimated the routes of empty taxi trips using an R-tree on Hadoop
  • Identified the taxi zones with the most empty taxis by hour of day and day of week
  • Visualized the distribution of empty taxis as a choropleth map in CartoDB

Evaluating the Sentiment Intensity of Words Used in Yelp Reviews

  • Cleaned and munged Yelp business information and customer review data
  • Implemented and optimized the Pegasos algorithm for sentiment analysis in Python
  • Designed an algorithm to evaluate the intensity level of popular words used to express sentiments for different types of restaurants
  • Identified the most notable customer concerns for each type of restaurant based on the sentiment scores of reviews

Other Experience:

Process Engineer Internship, DuPont

  • Collaborated with a team of 5 engineers to identify risks in the production process
  • Performed calculations for mass transfer and fluid mechanics problems
  • Researched technical information on heat exchangers, flow meters and pumps
  • Studied DuPont's standard operating procedures and SHE management
  • Edited the plant's management process for high-risk activities

Education Background:

New York University:

  • M.S. in Applied Urban Informatics

University at Buffalo, The State University of New York:

  • B.S. in Chemical Engineering
  • B.A. in Mathematics