After the Hans Reiser affair, I had wondered whether development of the Reiser filesystem would be held back,

but work on a new filesystem that could surpass it is moving along steadily: the HAMMER filesystem!


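It is still under development, so I don't know the details, but the work seems to be going at a good pace. It started in early 2007, and the following message was recently posted to the kernel newsgroup.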

HAMMER update - 15 nov 2007
From: Matthew Dillon 
To: 
Subject: HAMMER update - 15 nov 2007
Date: Thursday, November 15, 2007 - 8:54 pm

HAMMER work is still progressing well, I hope to have most of it
    working in a degenerate single-cluster (64MB filesystem) case by the
    end of next week.  (cluster == 64MB block of the disk, not cluster as
    in clustering).

    Gluing the per-cluster B-Trees together for the multi-cluster case
    is turning out to be more of a headache and will probably take at
    least 2 weeks to get working.  Some fairly sophisticated heuristics
    will be needed to avoid unnecessary copying between clusters.

    I may decide to move the 2.0 release to mid-January to give myself some
    more time.  This is similar to what we did for 1.8.  Also, I think a
    January release is better than a Christmas release because people get
    busy with Christmas-like things.  I want the filesystem to be at least
    beta quality as of the release and I don't think it's possible to get it
    there by mid-December.

						-Matt

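He says 2.0 will probably come out around mid-January rather than at Christmas, so I am looking forward to seeing the results soon.
For reference, I am also posting the HAMMER design document that was put up on the DragonFlyBSD kernel mailing list in August.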

HAMMER filesystem update - design document

From:	 Matthew Dillon 
Date:	 Wed, 10 Oct 2007 12:33:45 -0700 (PDT)
    Ok, here's the final design document that I am now implementing.
    Again, I expect most or all of these features to be ready and the
    filesystem to be beta-quality by the December release.


			       Hammer Filesystem

(I) General Storage Abstraction

    HAMMER uses a basic 16K filesystem buffer for all I/O.  Buffers are
    collected into clusters, clusters are collected into volumes, and a
    single HAMMER filesystem may span multiple volumes.

    HAMMER maintains a small hinted radix tree for block management in
    each layer.  A small radix tree in the volume header manages cluster
    allocations within a volume, one in the cluster header manages buffer
    allocations within a cluster, and most buffers (pure data buffers
    excepted) will embed a small tree to manage item allocations within
    the buffer.

    Volumes are typically specified as disk partitions, with one volume
    designated as the root volume containing the root cluster.  The root
    cluster does not need to be contained in volume 0 nor does it have to
    be located at any particular offset.

    Data can be migrated on a cluster-by-cluster or volume-by-volume basis
    and any given volume may be expanded or contracted while the filesystem
    is live.   Whole volumes can be added and (with appropriate data
    migration) removed.

    HAMMER's storage management limits it to 32768 volumes, 32768 clusters
    per volume, and 32768 16K filesystem buffers per cluster.   A volume
    is thus limited to 16TB and a HAMMER filesystem as a whole is limited
    to 524288TB.  HAMMER's on-disk structures are designed to allow future
    expansion through expansion of these limits.  In particular, the volume
    id is intended to be expanded to a full 32 bits in the future and using
    a larger buffer size will also greatly increase the cluster and volume
    size limitations by increasing the number of elements the buffer-
    restricted radix trees can manage.
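
As a quick check of the arithmetic in the paragraph above, here is a minimal sketch that derives the per-cluster, per-volume, and per-filesystem limits from the three counts and the 16K buffer size quoted in the document. The constant names are made up for illustration and do not come from the HAMMER sources.

```python
# Limits as quoted in the design document; names are illustrative only.
BUFFER_SIZE = 16 * 1024            # 16K filesystem buffer
BUFFERS_PER_CLUSTER = 32768
CLUSTERS_PER_VOLUME = 32768
VOLUMES_PER_FILESYSTEM = 32768

MB = 1024 ** 2
TB = 1024 ** 4

cluster_bytes = BUFFER_SIZE * BUFFERS_PER_CLUSTER
volume_bytes = cluster_bytes * CLUSTERS_PER_VOLUME
filesystem_bytes = volume_bytes * VOLUMES_PER_FILESYSTEM

print(cluster_bytes // MB, "MB per cluster")        # 512 MB
print(volume_bytes // TB, "TB per volume")          # 16 TB
print(filesystem_bytes // TB, "TB per filesystem")  # 524288 TB
```

The 512MB figure is the maximum cluster size implied by these limits; the 15 Nov update quoted earlier describes a 64MB cluster.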

    HAMMER breaks all of its information down into objects and records.
    Records have a creation and deletion transaction id which allows HAMMER
    to maintain a historical store.  Information is only physically deleted
    based on the data retention policy.  Those portions of the data retention
    policy affecting near-term modifications may be acted upon by the live
    filesystem but all historical vacuuming is handled by a helper process.

    All information in a HAMMER filesystem is CRCd to detect corruption.
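
To illustrate what CRC protection buys, the following is a minimal sketch that stores a CRC32 in front of a payload and checks it on read. The document does not say which CRC HAMMER uses or where it is stored, so the choice of `zlib.crc32`, the 4-byte prefix layout, and the `seal`/`verify` names are all assumptions.

```python
import zlib


def seal(payload: bytes) -> bytes:
    """Prefix the payload with its CRC32 as 4 big-endian bytes (assumed layout)."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def verify(block: bytes) -> bytes:
    """Return the payload if the stored CRC matches, otherwise raise."""
    stored = int.from_bytes(block[:4], "big")
    payload = block[4:]
    if zlib.crc32(payload) != stored:
        raise ValueError("CRC mismatch: block is corrupt")
    return payload


sealed = seal(b"hammer record")
print(verify(sealed))  # the original payload comes back unchanged
```

Flipping any single bit of the payload changes the computed CRC32, so `verify` raises instead of returning damaged data.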

(II) Filesystem Object Topology

    The objects and records making up a HAMMER filesystem are organized into
    a single, unified B-Tree.  Each cluster maintains a B-Tree of the
    records contained in that cluster and a unified B-Tree is constructed by
    linking clusters together.  HAMMER issues PUSH and PULL operations
    internally to open up space for new records and to balance the global
    B-Tree.  These operations may have the side effect of allocating
    new clusters or freeing clusters which become unused.

    B-Tree operations tend to be limited to a single cluster.  That is,
    the B-Tree insertion and deletion algorithm is not extended to the
    whole unified tree.  If insufficient space exists in a cluster HAMMER
    will allocate a new cluster, PUSH a portion of the existing
    cluster's record store to the new cluster, and link the existing
    cluster's B-Tree to the new one.
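
The allocate, PUSH, and link sequence just described can be pictured with a toy model. This is only a sketch under heavy assumptions: each cluster holds a sorted list of keys instead of a real B-Tree, clusters are chained in key order by a hypothetical `next` link, and the PUSH moves the upper half of the keys. None of these details come from the document.

```python
import bisect
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cluster:
    capacity: int                                   # must be at least 2
    keys: List[int] = field(default_factory=list)   # sorted record keys
    next: Optional["Cluster"] = None                # cluster holding larger keys


def insert(cluster: Cluster, key: int) -> None:
    # Walk the chain to the cluster whose key range covers this key.
    while cluster.next is not None and key >= cluster.next.keys[0]:
        cluster = cluster.next
    if len(cluster.keys) >= cluster.capacity:
        # Out of space: allocate a new cluster, PUSH the upper half of the
        # records into it, and link the existing cluster to the new one.
        mid = len(cluster.keys) // 2
        new = Cluster(cluster.capacity, cluster.keys[mid:], cluster.next)
        cluster.keys = cluster.keys[:mid]
        cluster.next = new
        if key >= new.keys[0]:
            cluster = new
    bisect.insort(cluster.keys, key)


root = Cluster(capacity=4)
for k in [5, 1, 9, 3, 7, 2, 8]:
    insert(root, k)
```

Each insertion touches at most two clusters, which mirrors the point that the insertion algorithm is not extended to the whole unified tree.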

    Because B-Tree operations tend to be restricted and because HAMMER tries
    to avoid balancing clusters in the critical path, HAMMER employs a
    background process to keep the topology as a whole in balance.  One
    side effect of this is that HAMMER is fairly loose when it comes to
    inserting new clusters into the topology.

    HAMMER objects revolve around the concept of an object identifier.
    The obj_id is a 64 bit quantity which uniquely identifies a filesystem
    object for the entire life of the filesystem.  This uniqueness allows
    backups and mirrors to retain varying amounts of filesystem history by
    removing any possibility of conflict through identifier reuse.  HAMMER
    typically iterates object identifiers sequentially and expects to never
    run out.  At a creation rate of 100,000 objects per second it would
    take HAMMER around 6 million years to run out of identifier space.
    The characteristics of the HAMMER obj_id also allow HAMMER to operate
    in a multi-master clustered environment.
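
The "around 6 million years" estimate can be checked with a few lines of arithmetic. This sketch assumes the full 64-bit space is usable and a year of 365.25 days; the constant names are illustrative.

```python
OBJ_ID_BITS = 64
CREATION_RATE = 100_000                  # objects per second, from the text
SECONDS_PER_YEAR = 365.25 * 24 * 3600    # assumed year length

years = 2 ** OBJ_ID_BITS / CREATION_RATE / SECONDS_PER_YEAR
print(f"{years / 1e6:.2f} million years")
```

The result comes out a little under 6 million years, consistent with the rounded figure in the document.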

    A filesystem object is made up of records.  Each record references a
    variable-length store of related data, a 64 bit key, and a creation
    and deletion transaction id which is indexed along with the key.

    HAMMER utilizes a 64 bit key to index all records.  Regular files use
    the base data offset of the record as the key while directories use a
    namekey hash as the key and store one directory entry per record.  For
    all intents and purposes a directory can store an unlimited number of
    files. 
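
A minimal sketch of the two keying schemes described above: a regular file record is keyed by its base data offset, and a directory entry is keyed by a hash of its name. The document does not specify the namekey hash, so the use of BLAKE2b truncated to 64 bits is purely an assumption, as are the function names.

```python
import hashlib


def file_record_key(base_offset: int) -> int:
    """Regular files: the key is the base data offset of the record."""
    if not 0 <= base_offset < 2 ** 64:
        raise ValueError("offset does not fit in a 64 bit key")
    return base_offset


def directory_entry_key(name: str) -> int:
    """Directories: the key is a 64 bit hash of the entry name (assumed hash)."""
    digest = hashlib.blake2b(name.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


print(file_record_key(65536))
print(hex(directory_entry_key("passwd")))
```

Any real implementation also has to handle two names hashing to the same key, which this sketch leaves out.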

    HAMMER is also capable of associating any number of out-of-band
    attributes with a filesystem object using a separate key space.  This
    key space may be used for extended attributes, ACLs, and anything else
    the user desires.

(III) Access to historical information

    A HAMMER filesystem can be mounted with an as-of date to access a
    snapshot of the system.  Snapshots do not have to be explicitly taken
    but are instead based on the retention policy you specify for any
    given HAMMER filesystem.  It is also possible to access individual files
    or directories (and their contents) using an as-of extension on the
    file name.

    HAMMER uses the transaction ids stored in records to present a snapshot
    view of the filesystem as-of any time in the past, with a granularity
    based on the retention policy chosen by the system administrator.  This
    feature also effectively implements file versioning.
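
One way to picture an as-of view is as a filter over records by their creation and deletion transaction ids. The sketch below is an assumption-laden model: the `Record` fields, the use of `None` for "not yet deleted", and the exact boundary rule are chosen for illustration and are not taken from the document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Record:
    key: int
    data: bytes
    create_tid: int
    delete_tid: Optional[int] = None   # None means still live (assumed encoding)


def visible(rec: Record, as_of_tid: int) -> bool:
    """A record is visible if it was created at or before the as-of point
    and had not yet been deleted at that point."""
    created = rec.create_tid <= as_of_tid
    not_deleted = rec.delete_tid is None or as_of_tid < rec.delete_tid
    return created and not_deleted


def snapshot(records: List[Record], as_of_tid: int) -> List[Record]:
    return [r for r in records if visible(r, as_of_tid)]


# Two versions of the same key: v1 is replaced by v2 at transaction 20.
history = [Record(0, b"v1", 10, 20), Record(0, b"v2", 20)]
print([r.data for r in snapshot(history, 15)])  # [b'v1']
print([r.data for r in snapshot(history, 25)])  # [b'v2']
```

Because old versions stay in the store until the retention policy removes them, the same filter doubles as file versioning.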

(IV) Mirrors and Backups

    HAMMER is organized in a way that allows an information stream to be
    generated for mirroring and backup purposes.  This stream includes all
    historical information available in the source.  No queueing is required
    so there is no limit to the number of mirrors or backups you can have
    and no limit to how long any given mirror or backup can be taken offline.
    Resynchronization of the stream is not considered to be an expensive
    operation.

    Mirrors and backups are maintained logically, not physically, and may
    have their own, independent retention policies.  For example, your live
    filesystem could have a fairly rough retention policy, even none at all,
    then be streamed to an on-site backup and from there to an off-site
    backup, each with different retention policies.

(V) Transactions and Recovery

    HAMMER implements an instant-mount capability and will recover information
    on a cluster-by-cluster basis as it is being accessed.

    HAMMER numbers each record it lays down and stores a synchronization
    point in the cluster header.  Clusters are synchronously marked 'open'
    when undergoing modification.  If HAMMER encounters a cluster which is
    unexpectedly marked open it will perform a recovery operation on the
    cluster and throw away any records beyond the synchronization point.
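
The per-cluster recovery rule can be sketched as follows. The `ClusterState` type is hypothetical, records are reduced to their sequence numbers, and treating the synchronization point as inclusive is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterState:
    is_open: bool          # set synchronously before any modification
    sync_point: int        # last record number known to be synchronized
    records: List[int]     # record numbers in the order they were laid down


def recover(cluster: ClusterState) -> ClusterState:
    """If the cluster was left open, drop records beyond the sync point."""
    if cluster.is_open:
        cluster.records = [n for n in cluster.records if n <= cluster.sync_point]
        cluster.is_open = False
    return cluster


crashed = ClusterState(is_open=True, sync_point=3, records=[1, 2, 3, 4, 5])
print(recover(crashed).records)  # [1, 2, 3]
```

Since the check is local to one cluster, it can run lazily the first time that cluster is accessed, which fits the instant-mount behaviour described above.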

    HAMMER supports a userland transactional facility.  Userland can query
    the current (filesystem wide) transaction id, issue numerous operations
    and on recovery can tell HAMMER to revert all records with a greater
    transaction id for any particular set of files.  Multiple userland
    applications can use this feature simultaneously as long as the files
    they are accessing do not overlap.  It is also possible for userland
    to set up an ordering dependency and maintain completely asynchronous
    operation while still being able to guarantee recovery to a fairly
    recent transaction id.
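
The userland revert described above can be modelled in a few lines. This is only a sketch: the in-memory `files` mapping, the `(tid, payload)` record shape, and the `revert` function are invented for illustration and say nothing about the actual interface.

```python
from typing import Dict, List, Set, Tuple

Record = Tuple[int, bytes]   # (transaction id, payload), an assumed shape


def revert(files: Dict[str, List[Record]], names: Set[str], saved_tid: int) -> None:
    """Drop every record with a transaction id greater than saved_tid,
    but only for the named files; other files are left untouched."""
    for name in names:
        files[name] = [rec for rec in files[name] if rec[0] <= saved_tid]


files = {
    "a.db": [(1, b"x"), (5, b"y")],
    "b.db": [(6, b"z")],
}
saved_tid = 3                      # id queried before the operations began
revert(files, {"a.db"}, saved_tid)
print(files["a.db"])               # [(1, b'x')]
```

Restricting the revert to a named set of files is what lets several applications use the facility at once, provided their file sets do not overlap.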

(VI) Database files

    HAMMER uses 64 bit keys internally and makes key-based files directly
    available to userland.  Key-based files are not regular files and do not
    operate using a normal data offset space.

    You cannot copy a database file using a regular file copier.  The
    file type will not be S_IFREG but instead will be S_IFDB.   The file
    must be opened with O_DATABASE.  Reads which normally seek the file
    forward will instead iterate through the records and lseek/qseek can
    be used to acquire or set the key prior to the read/write operation.

Posted by Daniel Kwon