Documentation Time
I'm struggling to make my first draft as high-quality as I would like, especially given my time limitations before and during break. Still, getting comments on what I have right now will be very good for future writing as well. I'm currently running performance tests on the system. My current test case is a long one: it covers 3 dimensions and runs 25 requests at each average cache-hit rate of 0%, 10%, ..., 90%, and 100%. Then I'll compile this information into pretty little charts in Excel, where I have tables waiting to be filled.
An important note: I had to ditch Hibernate for persistence. When I got it working, I was getting terrible timing results, and that was almost entirely the fault of the persistence mechanism. So I rebuilt my R-Tree with flat files (during a frantic code session last Saturday) and have been getting very positive results since then. Here are a few charts I have compiled based on other tests. These charts are based on my timing method, which sleeps for a constant time at the start and then iterates over the volume of the request, sleeping for a linear value at each step. The constant simulates the overhead of a data request, while the linear value simulates the time of the calculation and other performance costs that grow with the size of the request. A larger linear value should give better results for the cache, while a larger constant should give worse ones (since the constant is multiplied by the number of uncached regions). The first two charts use a constant of 50 and a linear value of 3, while the second pair uses a linear value of 20.
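
To make the timing method concrete, here is a minimal sketch of how such a simulated cost could be written. It is not the actual test code: the class and method names are hypothetical, and it assumes the constant and linear values are in milliseconds, which the post does not state. The simulated cost of a request is the sum, over its uncached regions, of the constant plus the linear value times that region's volume.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the sleep-based timing model; all names are illustrative. */
final class SimulatedCost {
    private SimulatedCost() {}

    /**
     * Total simulated delay for a request whose uncached part is split into
     * regions with the given volumes. Each region pays the constant once,
     * plus the linear value for every unit of its volume.
     * Assumption: both values are in milliseconds.
     */
    static long estimateMillis(long constantMs, long linearMs, long[] uncachedRegionVolumes) {
        long total = 0;
        for (long volume : uncachedRegionVolumes) {
            total += constantMs + linearMs * volume;
        }
        return total;
    }

    /** Simulates one region: sleep for the constant, then once per unit of volume. */
    static void simulateRegion(long constantMs, long linearMs, long volume)
            throws InterruptedException {
        TimeUnit.MILLISECONDS.sleep(constantMs);
        for (long i = 0; i < volume; i++) {
            TimeUnit.MILLISECONDS.sleep(linearMs);
        }
    }
}
```

As an illustration with made-up volumes: a fully uncached request of volume 100 with constant 50 and linear 3 costs 50 + 3 × 100 = 350. If 60 units are cached and the remaining 40 fall into four uncached regions of 10 each, the cost is 4 × 50 + 3 × 40 = 320, only a small saving, because the constant is paid four times. With a linear value of 20, the same two cases cost 2050 and 1000, so the cache helps much more.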



4 Comments:
YAY! Now you can relax!
What was the overhead added by using a persistence architecture?
Some high-level discussion of that would be useful for future work, as the inclusion of a persistence architecture may end up being necessary in a distributed production environment, where the overhead of such a system would be negligible.
If this aspect of your test is modular enough that you can easily set a flag to force use of JPA (hibernate, toplink etc.) or your flat files, that might make for useful data comparison.
The overhead required to load my R-Tree implementation from persistence was regularly more costly than just running the calculation. When looking at the relative response times, my best performance was when it was completely uncached, but my worst reached up to 7 times as long as the uncached version. If persistence is the way to go, it requires something other than my R-Tree.
Since the flat files were a quick switch near the end, not a lot of thought was put into the transactions or distribution. The core idea was to get it working. Now, I need to figure out how to make this actually work in a distributed environment, and fast.
It doesn't "need" to work in a distributed environment; that can be left as a discussion for the actual paper. I would focus my efforts on getting the analysis complete and the write-up going. If you need help on any sections dealing with architecture or distribution, let me know.