1) Send Schema
http://docs.google.com/Doc?docid=dfn4hjr3_156gsj5k89b&hl=en
I posted this once before, but have since made one major update.
I'm leaning towards generating graphing data dynamically from the DB. This means I need to store the bug counts for the cross-section of each bug list for each person at least once a day. See the Table: project_person_bug_list_count section of the schema document for details.
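To make the daily snapshot idea concrete, here's a minimal sketch of the kind of row I have in mind and a once-a-day write. The column names are guesses based on the table name, not the actual schema in the doc:

```python
# Minimal sketch of the daily snapshot behind project_person_bug_list_count.
# Column names are illustrative guesses, not the real schema.
from dataclasses import dataclass
from datetime import date

@dataclass
class ProjectPersonBugListCount:
    project_id: int
    person_id: int
    bug_list_id: int
    snapshot_date: date
    bug_count: int

def record_daily_snapshot(store, counts):
    """counts: dict mapping (project_id, person_id, bug_list_id) -> today's count.
    Appends one row per cross-section so graphs can be built from the DB later."""
    today = date.today()
    for (project_id, person_id, bug_list_id), bug_count in counts.items():
        store.append(ProjectPersonBugListCount(
            project_id, person_id, bug_list_id, today, bug_count))
```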
2) Total quantity of data - # of rows
http://spreadsheets.google.com/ccc?key=pwxRLPxLbuoIXbKN4R_3Meg
The spreadsheet shows each table and the expected number of rows it will contain. I was very conservative with my estimates and I'd be stunned if the app grew this much. That said, if I was good at estimating application demand, I wouldn't be doing this project in the first place.
The key takeaway is that all of the configuration and site rendering records for 500 projects come to roughly half a million rows. BUT, when I add in the record of the bug counts for every person and every list (see the schema doc), that adds 11 billion rows over the course of a year.
Is 11 billion rows a lot in terms of a MySQL table?
If so, I'll look into a way to collapse that data. With some post-processing it would be possible to reduce the number of rows to one per person per bug list by creating a comma-separated string of bug counts.
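As a rough illustration of that post-processing (the tuple layout and IDs are made up for the example), collapsing the daily rows into one comma-separated string per person per bug list could look something like this:

```python
# Rough sketch of the collapse idea: roll one row per (person, bug list, day)
# into one row per (person, bug list) holding a comma-separated run of counts.
from collections import defaultdict

def collapse_daily_counts(daily_rows):
    """daily_rows: iterable of (person_id, bug_list_id, day, count) tuples,
    assumed to be sorted by day."""
    runs = defaultdict(list)
    for person_id, bug_list_id, day, count in daily_rows:
        runs[(person_id, bug_list_id)].append(str(count))
    # One output row per person per bug list, e.g. "12,9,11,14"
    return {key: ",".join(counts) for key, counts in runs.items()}

# Example: four daily snapshots collapse into a single row per key.
rows = [(1, 7, "2009-06-01", 12), (1, 7, "2009-06-02", 9),
        (1, 7, "2009-06-03", 11), (1, 7, "2009-06-04", 14)]
print(collapse_daily_counts(rows))  # {(1, 7): '12,9,11,14'}
```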
3) Block or word diagram on how to aggregate data
http://docs.google.com/Doc?id=dfn4hjr3_174gq9pm9gq
This is a pretty basic explanation of what happens when you try to collect bug data.
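As a rough companion to that doc, here's a sketch of the collection step: pull each project's bug lists, count bugs per person, and hand the counts to the daily snapshot writer sketched above. The fetch_bugs function and its return shape are assumptions, not part of the real generator:

```python
# Tentative sketch of the aggregation step. fetch_bugs(bug_list_id) is assumed
# to return an iterable of (bug_id, assignee_person_id) pairs.
from collections import Counter

def aggregate_project(project_id, bug_lists, fetch_bugs):
    """Count bugs per person for each of a project's bug lists."""
    counts = {}
    for bug_list_id in bug_lists:
        per_person = Counter(person_id for _, person_id in fetch_bugs(bug_list_id))
        for person_id, bug_count in per_person.items():
            counts[(project_id, person_id, bug_list_id)] = bug_count
    return counts  # feed into record_daily_snapshot(...)
```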
4) Create a realistic schedule - research and deliverables
I've created a Google calendar. There's not much on it yet.
My goal (over the next two weeks) is to:
- Get my development environment set up
- Write the models and migrations based on the schema I've designed
- Prototype the migration script
Once I have the migration script, I'll have a real-life test of mapping the existing data from the legacy system to the new system. This should uncover any major gaps in the transition between the systems.
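A rough sketch of the kind of mapping that script will have to do; the legacy field names and transform rules here are placeholders, not the real legacy schema:

```python
# Sketch of the legacy-to-new mapping the migration script will prototype.
# Field names and defaults are illustrative only.
def migrate_projects(legacy_rows):
    """legacy_rows: iterable of dicts read from the old system's project table."""
    for old in legacy_rows:
        yield {
            "name": old["project_name"].strip(),
            "owner_person_id": old["owner_id"],
            # Fields with no legacy equivalent get explicit defaults so gaps
            # in the mapping show up during the prototype run.
            "site_template": old.get("template", "default"),
        }
```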
Following that I'll go after one of the following:
- Prototype the messaging layer for communication between the central controller and the generator (a rough sketch of one possible message format follows this list)
- Research the changes required to the generator
- Research access to NFS
- Develop the Central Controller site configuration pages
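If I start with the messaging layer, a very tentative sketch of the request shape might look like this. The message type, fields, and JSON transport are all assumptions at this point, just something concrete to prototype against:

```python
# Tentative sketch of a central controller -> generator message.
# Names, fields, and the JSON encoding are assumptions, not a decided protocol.
import json

def build_generate_request(project_id, bug_list_ids):
    """Controller side: ask the generator to rebuild one project's pages."""
    return json.dumps({
        "type": "generate_site",
        "project_id": project_id,
        "bug_list_ids": list(bug_list_ids),
    })

def handle_message(raw):
    """Generator side: decode and dispatch a request."""
    msg = json.loads(raw)
    if msg["type"] == "generate_site":
        print("would regenerate project", msg["project_id"])

# Round-trip example
handle_message(build_generate_request(42, [7, 9]))
```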