Objective
The goal of the project is to gain hands-on experience in building and deploying a scalable web service on the Internet, using the latest web technologies while learning how to tackle scalability and fault-tolerance concerns. This is a "learn by doing" course: the course project is the primary focus, with the lectures and discussion of research papers providing background material. Each project will be conducted in an agile team where students will build their own scalable, redundant web site using fundamental web technologies and the Ruby on Rails framework.
Getting Started
- Follow Chapter 1 of the book (Agile Web Development with Rails 4) for instructions on installing Rails on Windows, Mac, or Linux
- Read the list of project suggestions
- Add your own project suggestions
This Year's Projects
- A scalable automatic grading system.
- A competition tracking application.
- Share programming projects like MIT's Scratch.
- Find and compare nearby surf spots.
- Upload and share viral videos.
- Send photos to strangers, and upvote/downvote them.
- Questmaster: gamification of everyday tasks.
- Find parties near you and share yours.
- An electronic cabin guest book.
- Suppr: share meals with friends.
Project Ideas
- YCombinator inspired (YC Request for Startups): implement some portion of one of the YCombinator "Startup ideas we would like to fund".
- Government data project. A system that uses the large amounts of data at data.gov or the Amazon public data sets. See the Sunlight Foundation projects for some ideas.
- Leverage the data from the New York Times Developer APIs to build something interesting. They have APIs covering geography, movie reviews and more.
- Embrace the sharing economy! Build a time-sharing app for pets. Own 30% of a dog.
- Stock trade advisor. A system that gathers information about stocks, stock trades, and companies from both traditional and non-traditional sources (blogs, email lists, Twitter feeds, Facebook) and computes interesting data. Potentially interesting data includes correlations between stock price and non-traditional data, and trending information based on non-traditional sources. Could also include social aspects such as submitting sources, voting on the impact of a source, etc.
Project Sprint Schedule
Sprint -1: Starts Oct 8, 2014.
- Install Rails
- Learn Ruby
- Complete the Ruby course on Codecademy
- Learn Rails
- Read chapters one through eight in Agile Web Development with Rails
Sprint 0: Starts Oct 15, 2014.
- Read chapters nine through seventeen in Agile Web Development with Rails
- Learn TDD
- Learn Pairing
Sprint 1: Starts Oct 22, 2014.
- Form Groups
- Decide on Projects
- Basic user stories and page flow diagram
- Basic project planning
- Enter stories in Pivotal Tracker
Sprint 2: Starts Oct 29, 2014.
- Implement initial set of functionality
- Implement user accounts and authentication
- Use small dataset for development
- Use MySQL database
- Demo your web site on an instance of Amazon's Elastic Compute Cloud
- Learn EC2 and Amazon Web Console
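For the user accounts and authentication item above, the core idea is to store a salted, slow hash of each password rather than the password itself. The following is a minimal sketch of that idea using only Ruby's standard library. The method names, the iteration count, and the choice of PBKDF2 are assumptions made for illustration; in a Rails application you would more likely rely on an existing mechanism such as `has_secure_password` than write this yourself.

```ruby
require "openssl"
require "securerandom"

# Derive a hex digest from a password and salt with PBKDF2-HMAC-SHA256.
# The iteration count is an assumed work factor; tune it for your hardware.
# Returns [salt, digest] so both can be stored with the user record.
def hash_password(password, salt = SecureRandom.hex(16), iterations: 20_000)
  raw = OpenSSL::PKCS5.pbkdf2_hmac(
    password, salt, iterations, 32, OpenSSL::Digest.new("SHA256")
  )
  [salt, raw.unpack1("H*")]
end

# Compare two strings without returning early on the first mismatch,
# so the comparison time does not reveal how many characters matched.
def secure_equal?(a, b)
  return false unless a.bytesize == b.bytesize
  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# Re-derive the digest from the submitted password and the stored salt,
# then compare it with the stored digest.
def authenticate(password, salt, stored_digest)
  _, candidate = hash_password(password, salt)
  secure_equal?(candidate, stored_digest)
end

# Usage: hash at sign-up, verify at login.
salt, stored = hash_password("correct horse")
puts authenticate("correct horse", salt, stored)
puts authenticate("wrong guess", salt, stored)
```

Only the salt and digest would be persisted; the plain-text password is never stored.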
Sprint 3: Starts Nov 5, 2014.
- Implement next set of functionality
- Write Capistrano deployment scripts that automate loading the production database and deploying production code
Sprint 4: Starts Nov 12, 2014.
- Finish implementing general functionality
- Describe the "critical path" for scalability, which is the sequence of pages that you expect most users to go through. This is the set of pages that you will optimize, scale and benchmark
- Create medium-large dataset (about 10,000 records)
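One way to create a dataset of the size mentioned above is to generate synthetic records with a script. The sketch below is only an illustration: the record shape (id, name, email, created_at) and the method names are hypothetical, and you would adapt them to your own schema. It uses a fixed random seed so that repeated runs produce the same data, which keeps later benchmarks comparable.

```ruby
# Build `count` synthetic user records. The fields are hypothetical
# placeholders; replace them with the columns of your own tables.
def generate_users(count, seed: 42)
  rng = Random.new(seed) # fixed seed: every run yields the same dataset
  base_time = Time.utc(2014, 1, 1)
  (1..count).map do |id|
    # Spread creation times over roughly one year after base_time.
    created = base_time + rng.rand(365 * 24 * 3600)
    {
      id: id,
      name: "user#{id}",
      email: "user#{id}@example.com",
      created_at: created.strftime("%Y-%m-%d %H:%M:%S")
    }
  end
end

# Serialize records as comma-separated text with a header row.
# None of the generated fields contain commas, so no quoting is needed.
def to_csv(records)
  header = records.first.keys.join(",")
  rows = records.map { |r| r.values.join(",") }
  ([header] + rows).join("\n") + "\n"
end

users = generate_users(10_000)
csv_text = to_csv(users)
puts "generated #{users.size} records"
```

The resulting text could be written to a file and bulk-loaded into the database, or the same records could be inserted through your models from a seed script.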
Sprint 5: Starts Nov 19, 2014.
- Fully implement the features that are exercised by the "critical path"
- Make sure the pages include all the required elements and data, so that every database access that needs to occur actually does occur
- For each of the critical sequences, list the database operations that are issued in the production environment
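Listing the database operations for a critical sequence can be partly automated once you have the SQL statements that a page issues. The sketch below assumes you have already captured those statements as plain strings (for example, copied from a log); the sample statements and the method name are made up for illustration. It groups statements by operation and table using a simple pattern match, which is a heuristic and will not handle joins or subqueries correctly.

```ruby
# Count statements by "VERB table". The table name is taken from the first
# FROM / INTO / UPDATE clause, which is only a rough heuristic.
def tally_operations(statements)
  statements.each_with_object(Hash.new(0)) do |sql, counts|
    verb = sql.strip.split(/\s+/, 2).first.to_s.upcase
    table = sql[/\b(?:FROM|INTO|UPDATE)\s+[`"]?(\w+)/i, 1] || "unknown"
    counts["#{verb} #{table}"] += 1
  end
end

# Hypothetical statements captured while loading one critical-path page.
captured = [
  "SELECT * FROM users WHERE id = 1",
  "SELECT * FROM posts WHERE user_id = 1 LIMIT 10",
  "SELECT * FROM posts WHERE user_id = 2 LIMIT 10",
  "INSERT INTO comments (post_id, body) VALUES (3, 'hi')",
  "UPDATE users SET last_seen_at = NOW() WHERE id = 1"
]

tally_operations(captured).each do |operation, count|
  puts "#{operation}: #{count}"
end
```

A tally like this makes repeated queries against the same table easy to spot, which is often a first hint of where caching or query changes may help.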
Sprint 6: Starts Nov 26, 2014.
- Create large dataset (greater than 100,000 records)
- Launch a load-generator instance on EC2 and use httperf on it to load your web site. Plot response time (sec/req) and throughput (req/sec) as you increase the load
- Push the envelope using a single server without compromising scalability
- Run multiple instances and test the scalability of your system
- Begin scaling experiments and optimizations. Document what you learn
- Continue to optimize and document results
- Implement caching and architectural changes (eventual consistency, etc.) to eliminate bottlenecks (database, application, HTTP)
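The two quantities to plot in this sprint, throughput (req/sec) and response time (sec/req), can be derived from a few raw numbers per load level. The sketch below shows that arithmetic and a simple way to spot where the server stops keeping up with the offered load. The field names, the 95% tolerance, and all sample figures are assumptions invented for illustration; they are not httperf output or real measurements.

```ruby
# Turn one raw sample into the two plotted quantities.
# throughput    = completed requests / test duration   (req/sec)
# response_time = total reply time / completed requests (sec/req)
def summarize(sample)
  {
    offered_rate: sample[:offered_rate],
    throughput: sample[:completed] / sample[:duration_s].to_f,
    response_time: sample[:total_reply_time_s] / sample[:completed].to_f
  }
end

# Return the first load level whose throughput falls below `tolerance`
# times the offered rate, or nil if the server kept up at every level.
def saturation_point(rows, tolerance: 0.95)
  rows.find { |r| r[:throughput] < tolerance * r[:offered_rate] }
end

# Placeholder samples: one 60-second run at each offered request rate.
samples = [
  { offered_rate: 50,  completed: 3000, duration_s: 60, total_reply_time_s: 60.0 },
  { offered_rate: 100, completed: 6000, duration_s: 60, total_reply_time_s: 180.0 },
  { offered_rate: 200, completed: 9000, duration_s: 60, total_reply_time_s: 1800.0 }
]

rows = samples.map { |s| summarize(s) }
rows.each do |r|
  puts format("%8d %12.1f %12.3f", r[:offered_rate], r[:throughput], r[:response_time])
end

knee = saturation_point(rows)
puts "saturates near #{knee[:offered_rate]} req/sec offered" if knee
```

Plotting the throughput and response-time columns against the offered rate, before and after each optimization, gives a direct way to document what each change bought you.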
Sprint 7: Starts Dec 3, 2014.
- Final performance experiments
- Document measurements
- Test for availability
- Write up results on team page
- Prepare final presentation
Final Presentation: date TBD
- Final Presentations