Google has donated Hibernate Shards, a Java 5 add-on to Hibernate that allows application-driven data partitioning using either your own configuration or one of their pre-built ones. One thing that caught my eye was that when they covered partitioned native ID generation (aka identity), they mentioned having Database A use the range 0-200000, Database B use the range 200001-400000, etc. I know that MySQL Cluster suggests using something like this:
Database A: Starting ID 1, Increment ID by 3 (number of databases)
Database B: Starting ID 2, Increment ID by 3
Database C: Starting ID 3, Increment ID by 3
This allows a hands-off approach, and lets you simply divide the ID by 3 and use the remainder to map to a "shard" (IDs from the database that starts at 3 leave a remainder of 0). Of course, if your data changes significantly, you may have to dump and reload to add another database. One way to avoid a dump and reload is to pick an increment that is larger than your actual database count, and just drop an extra database in at an unused starting ID. This can eat up your "keyspace" faster, but if you don't have a huge amount of data, you don't need partitioning that badly anyway.
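As a rough illustration of the starting ID and increment scheme, here is a minimal Java sketch. The class and method names (`StridedIds`, `nthId`, `slotFor`) are made up for this post and are not part of Hibernate Shards or MySQL. It assumes IDs start at 1 and that starting IDs run from 1 up to the increment, and it subtracts 1 before taking the remainder so that Database A, B and C land on slots 0, 1 and 2.

```java
final class StridedIds {

    /** The n-th ID (n = 0, 1, 2, ...) handed out by a database with the given starting ID. */
    static long nthId(long startingId, long increment, long n) {
        return startingId + n * increment;
    }

    /**
     * Zero-based slot that owns an ID: starting ID 1 maps to slot 0,
     * starting ID 2 to slot 1, and so on up to (increment - 1).
     */
    static int slotFor(long id, long increment) {
        if (id < 1 || increment < 1) {
            throw new IllegalArgumentException("id and increment must both be >= 1");
        }
        return (int) ((id - 1) % increment);
    }

    public static void main(String[] args) {
        // Hypothetical shard names matching the three databases in the example.
        String[] shards = {"Database A", "Database B", "Database C"};
        long increment = 3;
        for (long id = 1; id <= 7; id++) {
            System.out.println(id + " -> " + shards[slotFor(id, increment)]);
        }
    }
}
```

With an increment of 5 and only three databases, slots 3 and 4 (starting IDs 4 and 5) stay empty until you need them, which is the "drop an extra database in" trick: existing IDs keep mapping to the same slot because the increment never changes.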
If you have been reading my blog, you know that I use NHibernate, but I am confident that the techniques they are using are portable to .NET. I can see some value in using this for year-based partitioning, where archives are made available as read-only data, and each new database is created starting at the next available ID.
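For the range-style approach (each database starts at the next available ID), routing comes down to finding the database with the largest starting ID that is not above the ID you are looking for. Here is a small Java sketch of that lookup. The class name `YearRangeRouter`, its methods, and the database names in the usage note are hypothetical, and this is not how Hibernate Shards itself resolves shards.

```java
import java.util.Map;
import java.util.TreeMap;

final class YearRangeRouter {

    // Maps the first ID stored in each database to that database's name.
    private final TreeMap<Long, String> firstIdToDatabase = new TreeMap<>();

    /** Registers a database whose IDs begin at firstId. */
    void addDatabase(long firstId, String name) {
        firstIdToDatabase.put(firstId, name);
    }

    /** Returns the database with the largest first ID that is <= id. */
    String databaseFor(long id) {
        Map.Entry<Long, String> entry = firstIdToDatabase.floorEntry(id);
        if (entry == null) {
            throw new IllegalArgumentException("No database covers ID " + id);
        }
        return entry.getValue();
    }
}
```

Using the ranges from the example, you would register something like `addDatabase(0, "archive_2006")` and `addDatabase(200001, "current_2007")`. Starting a new year then just means registering one more database at the next available ID, and the older ones can go read-only without their IDs ever moving.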