Discussion:
ZFS monitoring
Borja Marcos
2013-02-11 15:53:10 UTC
Permalink
Hello,

I'm updating Devilator, the performance data collector for Orca and FreeBSD, to include ZFS monitoring. So far I am graphing the ARC and L2ARC size, L2ARC writes and reads, and several hit/miss data pairs.

Any suggestions to improve it? What other variables can be interesting?

An example of the current state of the program is here:

http://devilator.frobula.com

Thanks,





Borja.
Tim Cook
2013-02-11 15:56:14 UTC
Permalink
Post by Borja Marcos
Hello,
I'm updating Devilator, the performance data collector for Orca and
FreeBSD, to include ZFS monitoring. So far I am graphing the ARC and L2ARC
size, L2ARC writes and reads, and several hit/miss data pairs.
Any suggestions to improve it? What other variables can be interesting?
http://devilator.frobula.com
Thanks,
Borja.
The zpool iostat output has all sorts of statistics I think would be
useful/interesting to record over time.

--Tim
Borja Marcos
2013-02-11 16:14:30 UTC
Permalink
The zpool iostat output has all sorts of statistics I think would be useful/interesting to record over time.
Yes, thanks :) I think I will add them; I just started with the esoteric ones.

Anyway, there's still no better way to read it than running zpool iostat and parsing the output, right?





Borja.
Jim Klimov
2013-02-11 16:39:27 UTC
Permalink
Post by Borja Marcos
The zpool iostat output has all sorts of statistics I think would be useful/interesting to record over time.
Yes, thanks :) I think I will add them; I just started with the esoteric ones.
Anyway, there's still no better way to read it than running zpool iostat and parsing the output, right?
I believe in this case you'd have to run it as a continuous process
and parse the outputs after the first one (the first sample is the
overall since-boot average, IIRC).
Also note that on problems within the ZFS engine itself, "zpool" may
lock up and thus halt your program - so have it ready to abort an
outstanding statistics read after a timeout and perhaps log an error.

And if pools are imported or exported while your collector runs, the
"zpool iostat" output changes dynamically, so you basically need to
parse its text structure every time.
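To illustrate the parsing Jim describes, here is a hedged sketch in Python. The column layout (pool name, alloc, free, read/write operations, read/write bandwidth) and the sample row are assumptions based on typical `zpool iostat <interval>` output, not taken from the thread; a real collector would also wrap the subprocess read in a timeout, as suggested above.

```python
# Sketch: parse one data row of `zpool iostat <interval>` output.
# Assumed field layout: name, alloc, free, read ops, write ops,
# read bandwidth, write bandwidth (7 whitespace-separated columns).

def parse_size(s):
    """Convert zpool's human-readable numbers (e.g. '1.5M', '2.8T') to an int."""
    if s == '-':
        return None  # zpool prints '-' for fields it cannot report
    units = {'K': 2**10, 'M': 2**20, 'G': 2**30, 'T': 2**40}
    if s[-1] in units:
        return int(float(s[:-1]) * units[s[-1]])
    return int(float(s))

def parse_iostat_line(line):
    """Return a dict of counters for a pool row, or None for headers/separators."""
    fields = line.split()
    if len(fields) != 7:
        return None
    name, alloc, free, rops, wops, rbw, wbw = fields
    return {
        'pool': name,
        'alloc': parse_size(alloc),
        'free': parse_size(free),
        'read_ops': parse_size(rops),
        'write_ops': parse_size(wops),
        'read_bw': parse_size(rbw),
        'write_bw': parse_size(wbw),
    }

# Example row in the style `zpool iostat 5` prints (values are made up):
sample = "tank  1.2T  2.8T  15  120  1.5M  12.3M"
print(parse_iostat_line(sample))
```

With `zpool iostat -v`, the per-vdev rows are indented but still split into the same columns, so the same parser applies; distinguishing pool rows from vdev rows then requires looking at the leading whitespace before splitting.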

The "zpool iostat -v" might be even more interesting though, as it lets
you see per-vdev statistics and perhaps notice imbalances, etc...

All that said, I don't know whether this data isn't also available as
some set of kstats - that would probably be a lot better for your cause.
Inspect the "zpool" source to see where it gets its numbers from...
and perhaps add and RTI the relevant kstats, if they aren't there yet ;)

On the other hand, I am not certain how Solaris-based kstats correspond
to structures in FreeBSD (or Linux, for that matter)...

HTH,
//Jim Klimov
Pawel Jakub Dawidek
2013-02-12 10:25:04 UTC
Permalink
Post by Jim Klimov
Post by Borja Marcos
The zpool iostat output has all sorts of statistics I think would be useful/interesting to record over time.
Yes, thanks :) I think I will add them, I just started with the esoteric ones.
Anyway, still there's no better way to read it than running zpool iostat and parsing the output, right?
I believe, in this case you'd have to run it as a continuous process
and parse the outputs after the first one (overall uptime stat, IIRC).
Also note that on problems with ZFS engine itself, "zpool" may lock up
and thus halt your program - so have it ready to abort an outstanding
statistics read after a timeout and perhaps log an error.
And if pools are imported-exported during work, the "zpool iostat"
output changes dynamically, so you basically need to parse its text
structure every time.
The "zpool iostat -v" might be even more interesting though, as it lets
you see per-vdev statistics and perhaps notice imbalances, etc...
All that said, I don't know if this data isn't also available as some
set of kstats - that would probably be a lot better for your cause.
Inspect the "zpool" source to see where it gets its numbers from...
and perhaps add and RTI the relevant kstats, if they aren't there yet ;)
On the other hand, I am not certain how Solaris-based kstats interact
or correspond to structures in FreeBSD (or Linux for that matter)?..
I made kstat data available on FreeBSD via 'kstat' sysctl tree:

# sysctl kstat
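A collector could read that tree by shelling out to sysctl(8) and parsing its `name: value` lines. The sketch below makes that concrete; the `kstat.zfs.misc.arcstats.*` names follow the naming scheme of the sysctl tree Pawel describes, but the sample values are made up.

```python
# Sketch: read ZFS kstats on FreeBSD via the `kstat` sysctl tree.
import subprocess

def parse_sysctl_output(text):
    """Turn sysctl's 'name: value' lines into a dict of integer counters."""
    stats = {}
    for line in text.splitlines():
        name, _, value = line.partition(':')
        value = value.strip()
        if value.lstrip('-').isdigit():  # keep numeric kstats only
            stats[name.strip()] = int(value)
    return stats

def read_arcstats():
    """Fetch the ARC statistics subtree (runs sysctl, so FreeBSD only)."""
    out = subprocess.run(['sysctl', 'kstat.zfs.misc.arcstats'],
                         capture_output=True, text=True, check=True).stdout
    return parse_sysctl_output(out)

# Example output lines (made-up values):
sample = """kstat.zfs.misc.arcstats.hits: 123456
kstat.zfs.misc.arcstats.misses: 7890
kstat.zfs.misc.arcstats.size: 536870912"""
print(parse_sysctl_output(sample))
```

Using sysctl avoids the lock-up concern raised earlier for `zpool`, since the counters are exported directly by the kernel rather than computed by a userland tool.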
--
Pawel Jakub Dawidek http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://tupytaj.pl
Borja Marcos
2013-02-12 11:00:42 UTC
Permalink
Yes, I am using the data. I wasn't sure how to get something meaningful from it, but I've found the arcstats.pl script and I am using it as a model.

Suggestions are always welcome, though :)

(the sample pages I put on devilator.frobula.com aren't using the better-organized graphs, though; it's just a crude parameter dump)
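As a small illustration of the kind of derived metric arcstat-style tools graph, here is a hedged sketch of an interval ARC hit percentage computed from two snapshots of cumulative kstat counters. The counter names and sample values are assumptions for the example, not data from this thread.

```python
# Sketch: arcstat-style derived metric - ARC hit percentage per interval,
# from two snapshots of the cumulative 'hits'/'misses' kstat counters.

def hit_rate(prev, cur):
    """Interval hit percentage between two cumulative-counter snapshots."""
    hits = cur['hits'] - prev['hits']
    misses = cur['misses'] - prev['misses']
    total = hits + misses
    return 100.0 * hits / total if total else 0.0

# Made-up cumulative counters from two consecutive samples:
prev = {'hits': 1000, 'misses': 100}
cur = {'hits': 1900, 'misses': 200}
print(f"{hit_rate(prev, cur):.1f}% ARC hits")  # 900 hits / 1000 accesses
```

Differencing consecutive snapshots matters because the kstats are cumulative since boot; graphing the raw counters would only show a monotonically rising line.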





Borja.

Sašo Kiselkov
2013-02-11 16:27:45 UTC
Permalink
Post by Borja Marcos
Hello,
I'm updating Devilator, the performance data collector for Orca and FreeBSD, to include ZFS monitoring. So far I am graphing the ARC and L2ARC size, L2ARC writes and reads, and several hit/miss data pairs.
Any suggestions to improve it? What other variables can be interesting?
http://devilator.frobula.com
Hi Borja,

I've got one thing up for review in Illumos for upstreaming: #3137 L2ARC
Compression. This adds another kstat called l2_asize, which tells you
how big the L2ARC actually is, taking into account compression, e.g.

# kstat -n arcstats 5 | egrep '(\<l2_size\>|\<l2_asize\>)'
l2_asize 25708032
l2_size 29117952

You can use this to track L2ARC compression efficiency, etc. If the
kstat is not present, then L2ARC compression isn't available.
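The efficiency calculation follows directly from the two counters; a quick sketch using the sample values from the kstat output above:

```python
# L2ARC compression efficiency from the l2_size / l2_asize kstats.
l2_size = 29117952    # logical bytes cached in the L2ARC
l2_asize = 25708032   # actual (compressed) bytes on the cache device

ratio = l2_size / l2_asize                  # compression ratio
saved_pct = 100.0 * (1 - l2_asize / l2_size)  # % of device space saved
print(f"compression ratio {ratio:.2f}x, {saved_pct:.1f}% space saved")
```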

Anyway, just a quick thought.

--
Saso