asnaedae @ home
random musings from a twisted mind
Dec 30th
Well, I've been playing with OpenSolaris, which uses Mercurial as its DVCS. Another benefit of Mercurial over git is that its command set is very similar to SVN/Subversion's, which is important for those of us who /don't/ use it every day.
Migrating was straightforward:
$ mkdir ~/svn && cd ~/svn
$ hgimportsvn https://dubdubdub.co.uk/svn/mike
$ find . -name .svn -type d | xargs rm -rf
One thing to note when using Mercurial, which took me a bit of reading to realise: it's distributed, so if you have your own local copy, you have to commit locally and then push/pull the changes out!
Oct 25th
Well, this is slightly surprising, but in a very good way, and it does lead to some interesting suggestions on how best to improve matters. Look at the following graph of FAST ESP query latency:
Notice that the average latency drops as we use the server more . . . but WHY?
Well, that's just because we're running the FAST indexes on a ZFS-based file system and the L2 ARC cache is making its presence felt
# arcstat.pl
    Time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz  cur
11:25:52   13G  263M      1  158M    1  104M   15   44M   11     2G   2G
11:25:53   29K   103      0    97    0     6    2     2   15     2G   2G
11:25:54   10K   161      1   156    1     5   13     1    9     2G   2G
11:25:55   10K   197      1   174    1    23   18     3   50     2G   2G
Of course, I’d really like to try playing with a few Enterprise grade SSDs to supplement the L2 ARC – should be able to soak most of the “hot” data from SSD without going back to the spinning rust (admittedly the full index data set is only 80GB)
*patiently waits for Sun to get their fingers out*
Update
Using Ben Rockwood’s arc_summary.pl tool we get the following view into the ARC cache:
ARC Size:
Current Size: 4659 MB (arcsize)
Target Size (Adaptive): 4659 MB (c)
Min Size (Hard Limit): 1023 MB (zfs_arc_min)
Max Size (Hard Limit): 31735 MB (zfs_arc_max)
ARC Size Breakdown:
Most Recently Used Cache Size: 29% 1393 MB (p)
Most Frequently Used Cache Size: 70% 3266 MB (c-p)
ARC Efficency:
Cache Access Total: 1464939081
Cache Hit Ratio: 91% 1342983472 [Defined State for buffer]
Cache Miss Ratio: 8% 121955609 [Undefined State for Buffer]
REAL Hit Ratio: 78% 1146170142 [MRU/MFU Hits Only]
Data Demand Efficiency: 91%
Data Prefetch Efficiency: 86%
CACHE HITS BY CACHE LIST:
Anon: 10% 142482495 [ New Customer, First Cache Hit ]
Most Recently Used: 5% 74249410 (mru) [ Return Customer ]
Most Frequently Used: 79% 1071920732 (mfu) [ Frequent Customer ]
Most Recently Used Ghost: 1% 19996413 (mru_ghost) [ Return Customer Evicted, Now Back ]
Most Frequently Used Ghost: 2% 34334422 (mfu_ghost) [ Frequent Customer Evicted, Now Back ]
CACHE HITS BY DATA TYPE:
Demand Data: 53% 712758575
Prefetch Data: 17% 241164086
Demand Metadata: 20% 280805976
Prefetch Metadata: 8% 108254835
CACHE MISSES BY DATA TYPE:
Demand Data: 52% 64233340
Prefetch Data: 30% 36667246
Demand Metadata: 15% 19272211
Prefetch Metadata: 1% 1782812
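As a sanity check on how those headline ratios fit together, here is a minimal Python sketch that recomputes them from the raw counters above. The helper name and the truncating integer percentages are my own assumptions for illustration, not arc_summary.pl's actual code.

```python
# Raw counters copied from the arc_summary.pl output above.
CACHE_ACCESS_TOTAL = 1464939081
CACHE_HITS = 1342983472
CACHE_MISSES = 121955609
MRU_HITS = 74249410
MFU_HITS = 1071920732


def percent(part, total):
    """Whole-number percentage, truncated (assumed to mirror the report)."""
    return part * 100 // total


hit_ratio = percent(CACHE_HITS, CACHE_ACCESS_TOTAL)
miss_ratio = percent(CACHE_MISSES, CACHE_ACCESS_TOTAL)
# The "REAL" hit ratio counts only MRU and MFU list hits,
# leaving out the anon and ghost-list hits.
real_hit_ratio = percent(MRU_HITS + MFU_HITS, CACHE_ACCESS_TOTAL)

print(f"hit {hit_ratio}%  miss {miss_ratio}%  real hit {real_hit_ratio}%")
```

The three figures come out as 91, 8 and 78, matching the report. Hits plus misses add up exactly to the access total, and MRU plus MFU hits add up to the 1146170142 shown on the REAL Hit Ratio line.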
Oct 6th
Well, we made Thai massaman curry over the weekend, and it was pretty good. The only downside is that we didn't make enough rice, for a change! But the rice cooker took care of that.
ingredients – for two
1) put potatoes in steamer for 20 minutes to cook
2) heat up pan, add oil and soften cut onion for 5 minutes
3) add chicken and brown for 3 minutes, add 2 tsp of massaman curry paste and cook for a further 6 minutes
4) add chicken stock and coconut milk and cover with saucepan lid
5) cook for 30 minutes, add cooked potatoes and peanuts and cook for further 10 minutes
6) serve
Sep 29th

# tcpdump -ni en0 port 80 -w output.trace
# tcptrace -G output.trace
# xplot *tput.xpl
From the online manpage:
Other useful graphs:
Just some notes here so I don't forget the basics; the manual is over here.
Sep 26th
The following reply by Iljitsch van Beijnum about queueing delays in IP looked to be a good little summary.
The answer is that delay is only one aspect of performance, another important one is packet loss. As link bandwidth increases, queuing delays decrease proportionally. So if you’re using your 10 Mbps link with average 500 byte packets at 98% capacity, you’ll generally have a 49-packet queue. (queue = utilization / (1 – utilization)) Our 500 byte packets are transmitted at 0.4 ms intervals, so that makes for a 19.6 ms queuing delay.
So now we increase our link speed to 100 Mbps, but for some strange reason this link is also used at 98%. So the average queue size is still 49 packets, but it now only takes 0.04 ms to transmit one packet, so the queuing delay is only 1.96 ms on average.
As you can see, as bandwidth increases, queuing delays become irrelevant. To achieve even 1 ms queuing delay (that’s only 120 miles extra fiber) at 10 Gbps you need an average queue size of 833 even with 1500-byte packets. For this, you need a link utilization of almost 99.9%.
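To check the arithmetic in the reply so far, here is a small Python sketch of the formulas it uses. The function names are mine, and the queue estimate is just the simple utilization / (1 - utilization) rule quoted above, not a full queueing model.

```python
def avg_queue_packets(utilization):
    """Average queue length in packets: utilization / (1 - utilization)."""
    return utilization / (1.0 - utilization)


def packet_time_ms(packet_bytes, link_bps):
    """Time to serialise one packet onto the link, in milliseconds."""
    return packet_bytes * 8 / link_bps * 1000.0


def queuing_delay_ms(utilization, packet_bytes, link_bps):
    """Average queuing delay: queue length times per-packet transmit time."""
    return avg_queue_packets(utilization) * packet_time_ms(packet_bytes, link_bps)


def queue_for_delay(delay_ms, packet_bytes, link_bps):
    """Whole packets that must be queued to add up to delay_ms."""
    return int(delay_ms / packet_time_ms(packet_bytes, link_bps))


def utilization_for_queue(queue_packets):
    """Invert the queue formula: utilization = q / (q + 1)."""
    return queue_packets / (queue_packets + 1.0)


if __name__ == "__main__":
    print(round(queuing_delay_ms(0.98, 500, 10e6), 2))    # 10 Mbps link
    print(round(queuing_delay_ms(0.98, 500, 100e6), 2))   # 100 Mbps link
    q = queue_for_delay(1.0, 1500, 10e9)                  # 10 Gbps link
    print(q, round(utilization_for_queue(q) * 100, 2))
```

This reproduces the 19.6 ms and 1.96 ms delays, the 833-packet queue for 1 ms at 10 Gbps, and a required utilization of roughly 99.88%.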
However, due to IP’s bursty nature the queue size is quite variable. If there is enough buffer space to accommodate whatever queue size that may be required due to bursts, this means you get a lot of jitter. (And, at 10 Gbps, expensive routers, because this memory needs to be FAST.) On the other hand, if the buffer space fills up but packets keep coming in faster than they can be transmitted, packets will have to be dropped. As explained by others, this leads to undesired behavior such as TCP congestion synchronization when packets from different sessions are dropped and poor TCP performance when several packets from the same session are dropped. So it’s important to avoid these “tail drops”, hence the need for creative queuing techniques.
However, at high speeds you really don’t want to think about this too much. In most cases, your best bet is RED (random early detect/drop) which gradually drops more and more packets as the queue fills up (important: you need to have enough buffer space or you still get excessive tail drops!) so TCP sessions are throttled back gradually rather than traumatically. Also, the most aggressive TCP sessions are the most likely to see dropped packets. With weighted RED some traffic gets a free pass up to a point, so that’s nice if you need QoS “guarantees”. (W)RED is great because it’s not computationally expensive and only needs some enqueuing logic but no dequeuing logic.
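For my own notes, here is a rough Python sketch of the RED idea described above: the drop probability ramps up linearly as the averaged queue grows between two thresholds. The thresholds, the maximum drop probability, the averaging weight and the class names are all hypothetical values picked for illustration. It also leaves out refinements found in real implementations, such as spacing drops out by the number of packets since the last drop.

```python
import random

# Hypothetical per-class profiles: (min_threshold, max_threshold, max_drop_prob),
# with thresholds in packets. The "priority" class gets its free pass by
# starting to drop later than "best_effort" (the weighted part of WRED).
PROFILES = {
    "best_effort": (20, 60, 0.10),
    "priority": (40, 60, 0.10),
}


def update_average(avg_queue, current_queue, weight=0.002):
    """Exponentially weighted moving average of the queue length."""
    return (1.0 - weight) * avg_queue + weight * current_queue


def drop_probability(avg_queue, min_th, max_th, max_p):
    """Linear ramp from 0 at min_th up to max_p just below max_th."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0  # past the upper threshold every arrival is dropped
    return max_p * (avg_queue - min_th) / (max_th - min_th)


def should_drop(avg_queue, traffic_class, rng=random.random):
    """Enqueue-time decision: drop the arriving packet or not."""
    min_th, max_th, max_p = PROFILES[traffic_class]
    return rng() < drop_probability(avg_queue, min_th, max_th, max_p)
```

The decision is made only when a packet arrives, which fits the point above that (W)RED needs enqueuing logic but no dequeuing logic. Because every arriving packet faces the same probability, sessions that send the most packets are the ones most likely to see a drop.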