Sack - sharding memory hash in Perl

The main design goal is to have an interactive environment to query
Perl hashes which are bigger than the memory of a single machine.

It is implemented using TCP sockets between Perl processes.
This allows horizontal scalability both on multi-core machines
and across the network to additional machines.

Reading data into a hash is done using any Perl module which
returns a Perl hash and supports offset and limit to select just
a subset of the data (this is required to create disjoint shards).

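As a rough illustration of how offset and limit can produce disjoint
shards, here is a minimal sketch. The helper shard_ranges is
hypothetical (it is not part of Sack) and assumes the total record
count is known up front.

```perl
use strict;
use warnings;

# Hypothetical helper: compute [offset, limit] pairs that cover
# $total records with at most $shards disjoint, contiguous ranges.
sub shard_ranges {
    my ( $total, $shards ) = @_;
    my $limit = int( ( $total + $shards - 1 ) / $shards );    # ceiling division
    my @ranges;
    for ( my $offset = 0; $offset < $total; $offset += $limit ) {
        my $left = $total - $offset;
        # the last range may be shorter than $limit
        push @ranges, [ $offset, $left < $limit ? $left : $limit ];
    }
    return @ranges;
}

# Example: 10 records spread over 3 shards
my @ranges = shard_ranges( 10, 3 );
print join( ' ', map { "offset=$_->[0],limit=$_->[1]" } @ranges ), "\n";
# prints: offset=0,limit=4 offset=4,limit=4 offset=8,limit=2
```

Each pair could then be handed to the input module on a different
node, so that no record is loaded by two shards.
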
Views are small Perl snippets which are called for each record
on each shard with $rec. Views create data in the $out hash, which
is automatically merged into the output.

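A view might look roughly like the sketch below. The record fields
(author, year) and the run_view driver are assumptions made for
illustration; only the names $rec and $out come from the description
above.

```perl
use strict;
use warnings;

# Hypothetical records held by one shard; field names are made up.
my @shard = (
    { author => 'Smith', year => 2001 },
    { author => 'Jones', year => 2001 },
    { author => 'Smith', year => 2003 },
);

# The view body: it sees one record as $rec and writes into $out.
my $view = sub {
    my ( $rec, $out ) = @_;
    $out->{year}->{ $rec->{year} }++;    # count records per year
};

# Stand-in for what a shard node would do: call the view per record.
sub run_view {
    my ( $code, $records ) = @_;
    my $out = {};
    $code->( $_, $out ) for @$records;
    return $out;
}

my $out = run_view( $view, \@shard );
print "$_ => $out->{year}->{$_}\n" for sort keys %{ $out->{year} };
# prints:
# 2001 => 2
# 2003 => 1
```
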
You can influence the default shard merge by adding + (plus sign)
to the name of your key to indicate that the key => value pairs below
it should have their values summed when combining shards.

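The effect of a + key can be sketched as follows. The merge_out
function is hypothetical, and the fallback of overwriting other keys
is an assumption; the default merge used by Sack is not described
here.

```perl
use strict;
use warnings;

# Hypothetical merge of one shard's $out into the combined result.
# Assumption: keys containing "+" hold key => number pairs which are
# summed; any other key is simply overwritten by the later shard.
sub merge_out {
    my ( $merged, $out ) = @_;
    for my $key ( keys %$out ) {
        if ( $key =~ /\+/ && ref( $out->{$key} ) eq 'HASH' ) {
            $merged->{$key}->{$_} += $out->{$key}->{$_}
                for keys %{ $out->{$key} };
        }
        else {
            $merged->{$key} = $out->{$key};
        }
    }
    return $merged;
}

my $merged = {};
merge_out( $merged, { 'year+' => { 2001 => 2, 2003 => 1 } } );    # shard 1
merge_out( $merged, { 'year+' => { 2001 => 5 } } );               # shard 2
print "$_ => $merged->{'year+'}->{$_}\n"
    for sort keys %{ $merged->{'year+'} };
# prints:
# 2001 => 7
# 2003 => 1
```
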
If you have long field names, add # to the name of the key above the
values which you want to turn into integer values. This will reduce
memory usage on the master node.

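How Sack performs this conversion is not spelled out here. One
plausible reading is that long strings are replaced by small integer
ids, which the sketch below illustrates. The intern function and its
lookup tables are hypothetical and not taken from Sack.

```perl
use strict;
use warnings;

# Hypothetical interning table: each distinct string gets an integer id.
my ( %id_of, @name_of );

sub intern {
    my ($name) = @_;
    unless ( exists $id_of{$name} ) {
        push @name_of, $name;
        $id_of{$name} = $#name_of;    # id is the index in @name_of
    }
    return $id_of{$name};
}

# Count occurrences under integer ids instead of repeating long strings.
my %count;
$count{ intern($_) }++ for (
    'Journal of Very Long Titles',
    'Another Lengthy Journal Name',
    'Journal of Very Long Titles',
);

print "$name_of[$_] => $count{$_}\n" for sort { $a <=> $b } keys %count;
# prints:
# Journal of Very Long Titles => 2
# Another Lengthy Journal Name => 1
```
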
USAGE

1. create cloud definition

  etc/cloud-name        IP addresses of nodes, similar to /etc/hosts
  etc/cloud-name.ssh    ssh configuration (user, compression, etc.)

2. shard data

  ./bin/shards.pl (hard-coded to use WebPAC::Input::ISI for now)

  ./bin/couchdb2shards.pl

3. start server

  CLOUD=etc/cloud-name ./lib/Sack/Server.pm

4. start repl

  ./lib/Sack/REPL.pm

5. start locally for development

  ./bin/split.sh
