Hi, my name is Yunong. I have worked at Centiment.io as a data scientist since Oct 2016, after graduating from the NYU Center for Urban Science + Progress. I specialize in data cleaning, munging, modeling, and visualization of spatio-temporal data.
Technical Skills:
- Data Mining: Python, R, C++
- Web Development: HTML, CSS, Bootstrap, d3.js, leaflet.js, Flask
- Big Data: Hadoop, Spark
- Cloud Computing: AWS EC2
- GIS: ArcGIS
- SQL: PostgreSQL, MySQL
- Others: MATLAB, OpenRefine, AWS SQS, LaTeX, Microsoft Office Applications
Work Experience:
Research Assistant, NYU mHealth MapCorps
- Assisted 7 research scientists and 4 engineers in conducting a mobile health project
- Set up the working environment (LAMP, Git, R) on AWS EC2 instances for the team
- Fetched data from an AWS SQS queue and loaded it into a PostgreSQL database
- Translated the research volunteer recruiting program from R into Python
- Stitched pictures into panoramic photographs using Python and shell scripts
- Blurred human faces in panoramic photographs with OpenCV
Projects:
- Developed a system for collecting, analyzing, and visualizing WiFi signal data with a team of 6
- Rendered raster map tiles of WiFi signals on the server using the Django framework
- Detected blind spots of WiFi signal for Red Hook Initiative using our system
- Designed and coded a mobile-friendly website to introduce the project and accommodate the map UI with Bootstrap, AngularJS and Leaflet.js
- Built a web app for analyzing real-time tweets about 2016 poll candidates
- Extracted keywords and sentiment of real-time tweets via Python NLTK
- Scheduled the analysis program with cron and served the analysis results on Heroku using Flask
- Connected and visualized raw data and analysis results on a dashboard with d3.js
Exploring the Empty Taxi Problem in NYC
- Filtered all trip records of empty taxis from TLC Taxi Trip Data (20GB) using Spark
- Estimated the routes of empty taxi trips using an R-tree on Hadoop
- Determined the taxi zones with the most empty taxis by hour of day and day of week
- Visualized the distribution of empty taxis with a choropleth map in CartoDB
Evaluating the Sentiment Intensity of Words Used in Yelp Reviews
- Cleaned and munged Yelp business information and customer review data
- Implemented and optimized the Pegasos algorithm for sentiment analysis in Python
- Designed an algorithm to evaluate the intensity level of popular words used to express sentiments for different types of restaurants
- Identified the most notable customer concerns for each type of restaurant based on the sentiment scores of reviews
Other Experiences:
Process Engineer Internship, DuPont
- Collaborated with a team of 5 engineers to identify risks in the production process
- Performed calculations for mass transfer and fluid mechanics problems
- Researched technical information on heat exchangers, flow meters and pumps
- Studied DuPont's standard operating procedures and SHE management
- Edited the plant's management process for high-risk activities
Education Background:
New York University:
- M.S. in Applied Urban Informatics
University at Buffalo, The State University of New York:
- B.S. in Chemical Engineering
- B.A. in Mathematics