NetBSD-SoC: Improving RAIDframe parity handling
What is it?
The problem: when a NetBSD system that uses software RAID for
protection against disk failures is shut down uncleanly (e.g.,
by a power failure or crash), it must rewrite all of the
redundancy information on the disks when it reboots. This
process can take many hours and impose a substantial I/O load on
the system, even though only a very small portion of the data
(or perhaps none at all) actually needs it.
The project, then, is to modify the RAID driver (raid(4), an
adaptation of the RAIDframe project undertaken at
CMU in the mid-1990s) to keep
better track of which parts of the disks are currently being written
to, and thus may need to be cleaned up in case of a crash.
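As a rough illustration of the idea (not the actual raid(4) code), the sketch below divides a disk into a fixed number of regions and keeps one bit per region that may have stale parity. The names `dirty_map`, `dm_mark_write`, `dm_regions_to_resync`, and the region count of 64 are all hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>

#define NREGIONS 64u  /* hypothetical: the disk is split into 64 regions */

struct dirty_map {
    uint64_t sectors_per_region; /* size of each region, in sectors */
    uint64_t bits;               /* bit i set => region i may have stale parity */
};

/* Mark every region touched by a write of `count` sectors starting at `start`. */
static void dm_mark_write(struct dirty_map *dm, uint64_t start, uint64_t count)
{
    assert(dm->sectors_per_region > 0);
    if (count == 0)
        return;
    uint64_t first = start / dm->sectors_per_region;
    uint64_t last = (start + count - 1) / dm->sectors_per_region;
    for (uint64_t r = first; r <= last && r < NREGIONS; r++)
        dm->bits |= UINT64_C(1) << r;
}

/* After a crash, only the regions counted here would need their parity rewritten. */
static unsigned dm_regions_to_resync(const struct dirty_map *dm)
{
    unsigned n = 0;
    for (unsigned r = 0; r < NREGIONS; r++)
        if (dm->bits & (UINT64_C(1) << r))
            n++;
    return n;
}
```

With 1024-sector regions, for example, a 100-sector write starting at sector 1000 marks regions 0 and 1, and an 8-sector write at sector 5000 marks region 4, so a resync after a crash would visit 3 regions rather than all 64.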
However, if the bookkeeping information needed to achieve that goal is
written to the disks too frequently, it will interfere with the
writing and reading of the actual data and reduce performance all of
the time, not just after a crash.
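One common way to limit that cost, sketched here under stated assumptions and not taken from the project's code, is to record a region as dirty on disk only when the first write to it begins, and to clear it lazily after it has been idle for a while. The `tracker` structure, the function names, and the `COOLDOWN_TICKS` value are hypothetical; `disk_updates` merely counts how often the on-disk bookkeeping would have to be written.

```c
#include <assert.h>
#include <stdbool.h>

#define NREGIONS       8u /* hypothetical number of tracked regions */
#define COOLDOWN_TICKS 3u /* hypothetical idle period before a region is cleaned */

struct region_state {
    unsigned in_flight;     /* writes currently outstanding in this region */
    unsigned idle_ticks;    /* ticks elapsed since the last write activity */
    bool     dirty_on_disk; /* region is recorded as dirty in the on-disk map */
};

struct tracker {
    struct region_state r[NREGIONS];
    unsigned disk_updates;  /* how many times the on-disk map was rewritten */
};

/* Called before a data write; only a clean -> dirty change costs a disk update. */
static void tr_write_begin(struct tracker *t, unsigned region)
{
    assert(region < NREGIONS);
    struct region_state *rs = &t->r[region];
    if (!rs->dirty_on_disk) {
        rs->dirty_on_disk = true;
        t->disk_updates++;  /* must reach the disk before the data write starts */
    }
    rs->in_flight++;
    rs->idle_ticks = 0;
}

/* Called when a data write completes; the region stays dirty for now. */
static void tr_write_end(struct tracker *t, unsigned region)
{
    assert(region < NREGIONS && t->r[region].in_flight > 0);
    t->r[region].in_flight--;
    t->r[region].idle_ticks = 0;
}

/* Periodic tick: clean regions idle for COOLDOWN_TICKS, batched into one update. */
static void tr_tick(struct tracker *t)
{
    bool changed = false;
    for (unsigned i = 0; i < NREGIONS; i++) {
        struct region_state *rs = &t->r[i];
        if (!rs->dirty_on_disk || rs->in_flight > 0)
            continue;
        if (++rs->idle_ticks >= COOLDOWN_TICKS) {
            rs->dirty_on_disk = false;
            rs->idle_ticks = 0;
            changed = true;
        }
    }
    if (changed)
        t->disk_updates++;
}
```

In this sketch a burst of many writes to the same region costs one bookkeeping update when the burst starts and one more when the region is finally cleaned, at the price of resyncing a few regions that were in fact already consistent at the time of a crash.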
2009-06-16: It's alive! And the most blatant bugs are gone, but
chances are it's still doing something wrong. More testing is needed.
- April 21, 2009: Community Bonding Period -- Students get to know mentors, read documentation, get up to speed to begin working on their projects.
- May 23, 2009: Students begin coding for their GSoC projects; Google begins issuing initial student payments
- July 6, 2009: Mentors and students can begin submitting mid-term evaluations.
- July 13, 2009: Mid-term evaluation deadline; Google begins issuing mid-term student payments provided passing student survey is on file.
- August 10, 2009: Suggested 'pencils down' date. Take a week to scrub code, write tests, improve documentation, etc.
- August 17, 2009: Firm 'pencils down' date. Mentors, students
and organization administrators can begin submitting final evaluations to
- August 24, 2009: Final evaluation deadline; Google begins issuing student and mentoring organization payments provided forms and evaluations are on file.
Mandatory (must-have) components:
- Must substantially reduce parity resync times.
- Should not impose excessive overhead on common use (e.g., FFS).
- Must be completely compatible with existing on-disk format.
Optional (would-be-nice) components:
- Less overhead for random I/O to the raw disk.
- Convert the initial parity writing of a newly created RAID to use this framework, thus letting it save its place if interrupted by a reboot.
- Reconstruction of a failed disk could also save its place if interrupted by a reboot.
- Replace raid(4)'s use of deprecated kernel locking facilities; this could potentially even improve performance in some cases.
(This section will be populated at a later stage in the project.)
Jed Davis <jld@NetBSD.org>
$Id: index.html,v 1.4 2009/06/16 05:58:08 jlpd Exp $