Wednesday, August 1, 2012

course: distributed-algorithms-and-systems

SOURCES
    - NoSQL Summer Reading List
    - NoSQL paper


Persistent Key/Value storage
    High-performance transaction system applications typically insert rows in a History table to provide an activity trace; at the same time the transaction system generates log records for purposes of system recovery. Both types of generated information can benefit from efficient indexing. 
        - "QUOTE: The manner in which Bigtable uses memtables and SSTables to store updates to tablets is analogous to the way that the Log-Structured Merge Tree [26] stores updates to index data. In both systems, sorted data is buffered in memory before being written to disk, and reads must merge data from memory and disk."
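
    The memtable/SSTable idea in the quote above can be illustrated with a toy sketch. It is not how Bigtable or any real LSM-tree is implemented: the class name TinyLSM, the memtable_limit parameter, and the use of in-memory sorted lists in place of on-disk files are all assumptions made for illustration.

```python
import bisect


class TinyLSM:
    """Toy LSM-style store: an in-memory memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}        # most recent writes, buffered in memory
        self.sstables = []        # sorted runs of (key, value); newest run is last
        self.memtable_limit = memtable_limit  # hypothetical flush threshold

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Emit the memtable as one sorted, immutable run.
        # A real system would write this run to disk sequentially.
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Reads merge memory and "disk": check the memtable first,
        # then the runs from newest to oldest, so the latest write wins.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.sstables):
            i = bisect.bisect_left(run, (key,))  # binary search on the sorted run
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)   # second write reaches the limit and triggers a flush
db.put("a", 3)   # newer value for "a" stays in the memtable for now
```

    Real systems also compact (merge) older runs in the background and keep a write-ahead log for the memtable; both are left out here.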

    - LSM-tree:
        - "The Log-Structured Merge-Tree (LSM-Tree) [origin]"

    - Fractal-tree, LSM-tree, B+tree (B+tree: InnoDB, LSM-tree: Cassandra/BigTable, Fractal-tree: TokuDB)
    - Sorted String Table (SSTable) or B+ Tree for a Database Index? (CouchDB:B+Tree, Cassandra:SSTable)
    - "Performance Data For LevelDB, Berkley DB And BangDB For Random Operations"

    - transactional Key/Value
    - BTREE + HASH
    - SSTable and MemTable
    - ARIES algorithm for WAL
    - random reads/writes, sequential reads/writes, batch operations
    - compression

SQL
    SQLite
    MySQL(InnoDB)
    MySQL(MyISAM)
    Postgres
    MS SQL
    Oracle
    DB2
    Greenplum

NoSQL
    GFS / HDFS
    BigTable / HBase
    MapReduce / Hadoop
    Cassandra    
    Dynamo
    MongoDB
    CouchDB
    Riak
    Megastore
        - Gmail, Picasa, Google Calendar, the Android Market, and Google's App Engine cloud all use Megastore

NewSQL
    F1 / 
    Spanner (Paxos, TrueTime)
        - wiki
            - Spanner: Google’s Globally-Distributed Database
            - Google reveals Spanner, the database tech that can span the planet


Distributed Shared Memory
    Topics:
        - UDP
        - JGroups
        - JXTA

    - types:
        - time:
            - sync
            - half-sync
            - async
        - consistency
            + strictly consistent (DSM)
            - eventually consistent
            - best-effort (p2p)
        - fault types
    - memory consistency models
    - DSM over message passing
    - latency vs consistency dilemma
    - lab: implement different DSM variants over message passing (MP)
    - lab: what level for HttpSession replication?
    - lab: business logic (relational model) over different memory consistency models
    - ?: examples of production systems with DSM (distributed open-source caches, IMDG, Tomcat session replication)?


    - lab: MMORPG
    - lab: business data in eventually consistent storage
    - lab: implement eventual consistency over MP


Eventually consistent storage
    - CAP theorem
    - different failure types
    - eventually consistent systems: Amazon Dynamo/Cassandra/?(Erlang)

Group communications
    - JGroups
    - Gossip protocols
    - distributed information collection (p2p networks)
    - membership protocols
    - lab: implement p2p
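
    As a starting point for the gossip item above, here is a minimal single-process simulation of push-style gossip. The function name push_gossip and its parameters (fanout, seed, max_rounds) are hypothetical, and no real network or JGroups API is involved: in each round every informed node forwards the rumor to a few randomly chosen peers.

```python
import random


def push_gossip(n_nodes, fanout=2, seed=0, max_rounds=100):
    """Simulate push gossip; return (rounds used, set of informed node ids)."""
    rng = random.Random(seed)   # seeded so that runs are repeatable
    informed = {0}              # node 0 starts with the rumor
    rounds = 0
    while len(informed) < n_nodes and rounds < max_rounds:
        newly_informed = set()
        for node in informed:
            # Each informed node pushes to `fanout` random peers other than itself.
            peers = [p for p in range(n_nodes) if p != node]
            newly_informed.update(rng.sample(peers, min(fanout, len(peers))))
        informed |= newly_informed
        rounds += 1
    return rounds, informed


rounds, informed = push_gossip(50, fanout=2, seed=1)
```

    The informed set can at most triple per round with fanout 2, so 50 nodes need at least 4 rounds; the typical count grows roughly logarithmically with the number of nodes.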


Distributed commit
    - types
        - sync    
        - half-sync
        - async
        - FLP result
    - 2PC
        - XA spec, CORBA
        - J2EE: JTA, JTS, OTS
        - algorithm
    - 3PC
        - algorithm
    - Paxos commit
        - Paxos algorithm
        - Paxos commit algorithm
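
    The 2PC item above can be sketched in a few lines. This is a failure-free, single-process illustration only: the Participant class, its can_commit flag, and the two_phase_commit function are assumptions made for the example, and there are no timeouts, logging, or coordinator recovery, which is exactly where 2PC gets hard.

```python
class Participant:
    """Hypothetical resource manager taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit   # stands in for "local work succeeded"
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1 (voting): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    decision = all(votes)              # a single "no" forces a global abort
    # Phase 2 (completion): broadcast the coordinator's decision.
    for p in participants:
        if decision:
            p.commit()
        else:
            p.abort()
    return "committed" if decision else "aborted"


ok = [Participant("db"), Participant("queue")]
bad = [Participant("db"), Participant("queue", can_commit=False)]
result_ok = two_phase_commit(ok)
result_bad = two_phase_commit(bad)
```

    In the second run the prepared "db" participant is rolled back because "queue" voted no, which shows the all-or-nothing outcome the protocol is meant to provide.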

Distributed time

Distributed Data Structures

Distributed Coordination

Queue solutions