Couchbase .NET SDK DP1 Available!

by jmorris 6. May 2014 12:31

At long last, we present the official Developer Preview 1 (DP1) for the Couchbase .NET 2.0 SDK! Previously, I posted about some of the motivation behind rewriting the Couchbase .NET SDK and also some of the goals and features you should expect to find in it.  This DP1 represents the minimal subset of features promised and should give you a feel for the direction we are taking the API from a developer’s perspective. In this post, I’ll show more.


Version 1.3.2 of the Couchbase .NET Client SDK has been released!

by jmorris 6. February 2014 12:28


Introducing the DOTNET SDK 2.0 Development Blog Series at Couchbase

by jmorris 20. January 2014 10:39

New blog introducing a series of posts I will be doing on the design and development of the new Couchbase .NET SDK 2.0 client at

The first post is a precursor to the rest of the series. In it I go over some of the history, motivation, goals and objectives, and features, and finally a bit of the high-level design that I will be writing about over the coming months as the new client is developed. This is intended to be an evolving, interactive series, so feel free to drop a comment if you see something that you think can be improved upon.



Couchbase .NET SDK 1.3.1 Released!

by jmorris 8. January 2014 14:36

Heads up that the Couchbase .NET SDK version 1.3.1 was officially released yesterday! This release is mostly a maintenance follow-up to the 1.3.0 release, which significantly improved (in fact, rewrote) the connection pooling portion of the client.

A list of issues and tasks that were resolved:

  • NCBC-289: Does not return errors object on exception
  • NCBC-344: NotImplementedException when storing against MemcachedClient in v1.3 client
  • NCBC-345: Update Readme.mdown to reflect changes in the Couchbase .NET SDK versioning policy
  • NCBC-353: Add node IP to error messages so that users can isolate issues easier
  • NCBC-352: Flag all Increment/Decrement methods with CAS params as 'Obsolete'
  • NCBC-337: Fix for 'View vquery was mapped to a dead node, failing.' errors
  • NCBC-341: AOOR when deserializing bootstrap config with empty 'pools' element
  • NCBC-327: Update Nuspec files to current VS Solution Configuration
  • NCBC-334: Add a post-merge git hook for updating the assembly version

The release is available on Nuget: or from S3:



Assembly Versioning With Git Hooks

by jmorris 6. January 2014 17:56

Source code can be found here:

Consider the scenario where you wish to base your assembly versioning off of your commit history from Git. Ideally, you would want to pull some information from Git about the commit you wish to build from. You would most definitely want this information to follow the traditional .NET versioning semantics or something similar, and, importantly, you would want it to be integrated with your build process and automated. Integrated, because once you have your process in place, you probably don’t want to waste time manually changing things every time you do a build; automated, because you don’t want human fingers touching it (and perhaps making human mistakes) on every build cycle.

Getting Commit Information from Git

Git provides a couple of different ways of getting information about the current state of your repository. Of prominence are “git-describe”, which shows the most recent tag that is reachable from a commit[1], and “git-rev-parse”, which prints the SHA1 of a given revision.

git describe --long shows you the most recent tag, the number of commits since that tag, and the shortened SHA1 of the latest commit. For example, given a tag created with the command “$ git tag -a 1.0.0 -m "1.0.0 tag"” and assuming that two commits have occurred since the tag was created, git-describe with the --long parameter will output “1.0.0-2-g1089655”.


Here 1.0.0 is your tag, 2 is the number of commits since you created the tag, and 1089655 is the shortened SHA1 of the last commit (the “g” prefix stands for Git).
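To make the three parts concrete, here is a minimal sketch that splits a describe string using only shell parameter expansion. The function name parse_describe is hypothetical, and the sketch assumes the string really has the <tag>-<count>-g<sha> shape that --long produces when a tag is reachable.

```shell
#!/bin/sh
# Split a `git describe --long` string into tag, commit count and short SHA.
# Assumes the input has the shape <tag>-<count>-g<sha>.
parse_describe() {
    described=$1
    sha=${described##*-g}      # text after the last "-g"
    rest=${described%-g*}      # drop the "-g<sha>" suffix
    count=${rest##*-}          # number of commits since the tag
    tag=${rest%-*}             # everything before the count
    printf '%s %s %s\n' "$tag" "$count" "$sha"
}

parse_describe "1.0.0-2-g1089655"   # prints: 1.0.0 2 1089655
```

Because the count and SHA are peeled off from the right, a tag that itself contains hyphens should still come through intact.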

Storing Git Commit Information in your Assembly

.NET assembly versioning is fairly well documented. In a nutshell, it involves the following four values, which are declared in a special class in your project under the Properties subfolder and compiled into the manifest of your assembly:

<major>.<minor>.<build number>.<revision>

All four values are non-negative integers separated by periods, for example 1.201.344.1. The .NET runtime uses this information, along with other information (say, from an App.config or Web.config file), to ensure that the correct version of an assembly is loaded at runtime.
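Since the version will eventually be generated by a script, it can help to check a candidate string before writing it anywhere. The following is a rough sketch only: the function name is_assembly_version is hypothetical, and the 65534 upper bound reflects the limit the C# compiler enforces on each part of an AssemblyVersion.

```shell
#!/bin/sh
# Succeed only for strings of the form <major>.<minor>.<build>.<revision>
# where every part is an integer from 0 to 65534.
is_assembly_version() {
    case $1 in
        *[!0-9.]*|.*|*.|*..*) return 1 ;;   # stray characters or empty parts
    esac
    old_ifs=$IFS
    IFS=.
    set -- $1                               # split on dots into $1..$n
    IFS=$old_ifs
    [ $# -eq 4 ] || return 1
    for part in "$@"; do
        # The length check keeps very long digit runs away from the comparison.
        [ "${#part}" -le 5 ] && [ "$part" -le 65534 ] || return 1
    done
}

if is_assembly_version "1.201.344.1"; then
    echo "1.201.344.1 looks like a valid assembly version"
fi
```

A check like this matters mostly for the last part: a long-lived tag can accumulate more commits than a version part is allowed to hold.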

The special class mentioned previously is the AssemblyInfo.cs class. It contains a number of attributes which, when compiled, make up a portion of the assembly manifest, a collection of static data describing how the elements of the assembly relate to each other[2].

Of importance to this blog post are the following three attributes contained within the AssemblyInfo.cs class:

  • AssemblyVersion
  • AssemblyInformationalVersion
  • AssemblyFileVersion

For our purposes, we will rely on two of these to version the assembly and attach build information to it: AssemblyVersion, which contains the <major>.<minor>.<build number>.<revision> values, and AssemblyInformationalVersion, which is largely a plain-text value.

Post Merge Git Hooks

Within every Git repo is a folder called “.git”. By default it is not visible, but it’s still accessible via the command line. The “.git” folder contains another set of folders where Git stores information related to the repository.


The “hooks” folder is a special folder for storing, well, hooks! What are hooks? They are special snippets of code or script that are run when certain events are triggered in Git. For example, there are hooks that run when a commit occurs, after a rebase, etc. A full overview of all of the hooks can be found in the githooks man page.

We want the version information to be updated only when a build is in progress, and only when a git pull brings in changes (otherwise the validity of the version is compromised), so we will use the post-merge hook. The post-merge hook fires after a successful pull and merge. This means, however, that if the local repository is already up to date with the remote, it will not fire. This is something to consider in your solution.

Creating a post-merge hook is easy: you simply create a file named “post-merge” with no file extension and place it in the .git/hooks directory. The easiest way to do this is probably by just navigating to that folder and doing a touch:
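A minimal sketch of that step, assuming a Unix-like shell such as Git Bash. The function name install_post_merge_hook is hypothetical, and the chmod is my addition: on Unix-like systems Git only runs hooks that are executable.

```shell
#!/bin/sh
# Create an empty, executable post-merge hook in the given repository.
install_post_merge_hook() {
    repo=${1:-.}
    hooks_dir="$repo/.git/hooks"
    if [ ! -d "$hooks_dir" ]; then
        echo "no .git/hooks directory under $repo" >&2
        return 1
    fi
    touch "$hooks_dir/post-merge"       # no file extension
    chmod +x "$hooks_dir/post-merge"    # Git skips hooks that are not executable
}

# Usage, from the root of a working copy:
#   install_post_merge_hook .
```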


Once you have the post-merge file in place, Git will invoke it after every git pull that results in the local repository being updated with changes from the remote. A git fetch on its own does not merge anything, so it does not trigger the hook. This workflow fits well in an automated build environment.

Using the Post-Merge Hook to Update the AssemblyInfo.cs File

Once you have the post-merge hook in place you can now write some script to update the version information within your .NET project.

When Git is installed, several Bash tools are installed along with it. One of these, probably familiar to most Linux developers and administrators, is sed (short for “stream editor”), a utility for parsing and transforming text. In this case, we will use sed and a regular expression to find the lines within the AssemblyInfo.cs file we wish to update and replace them with the updated revision information obtained via the git-describe command mentioned previously.

Here is the relevant code:


The script runs git-describe with the “--always” flag, which always returns something: if no tag is present, it returns a shortened SHA1 of the most recent revision. The script then checks the output of git-describe, which is stored in the variable “tag”, to see if it has a Git tag associated with it. If it does, the script splits the value in “tag” and concatenates the parts into a format compatible with the .NET versioning discussed previously. The result is stored in another variable, “version”, and echoed out on the next line. The “version” value then replaces the previous values in the AssemblyVersion and AssemblyFileVersion attributes using a regex and sed. Finally, the whole git-describe value stored in “tag” replaces the previous value of the AssemblyInformationalVersion attribute.

After building the project and right clicking on the assembly generated, this is what we see:


The .NET assembly version stored within the assembly manifest will also be As previously discussed this loosely translates to the <major>.<minor>.<build>.<revision> semantics described above. Additionally, if you choose to, you could base your tag structure off of’s guidelines and only use the last digit to designate a non-releasable test build (for example), based off the previous tag and the number of commits since it was created. On release builds you would specify a new tag with the last digit zero and the second-to-last digit one increment more than the previous, or something to that effect.





Visual Studio | Git | Versioning

Couchbase [SF] 2013 is Coming Up!

by jmorris 30. August 2013 19:16

On September 13th 2013, leading NOSQL database provider Couchbase will be invading San Francisco for its annual community conference, Couchbase [SF] 2013! I’ll be there as well, participating in a Couchbase Cluster – smaller interactive and discussion driven sessions – representing the .NET Client SDK.

The event will host three different tracks (developer, operations and administration), with speakers from Couchbase and from customers of Couchbase who have a wealth of experience and various use-cases to share. There will also be smaller interactive sessions that go over advanced topics, new offerings such as Mobile Couchbase and the various Couchbase Labs projects that are available for free on Github or via Nuget (if you are a .NET developer).

So if you’re in San Francisco on September 13th, or are willing to travel a bit, come by and join in on the fun!


conferences | couchbase | NOSQL

Daily WTF: PHP “Annotations”

by jmorris 3. November 2012 21:08

Annotations or Attributes (Java vs. .NET/C#) are a means of decorating classes, methods and properties with additional metadata or declarative information. The annotations/attributes can then be queried at runtime via reflection and methods associated with them can be invoked.

Incredibly powerful and useful, they are quite common in various frameworks for tasks such as validating the data associated with a class or property, or mapping properties on an entity to column names in a table in an RDBMS. There are many, many other uses as well.

Here is an example in C#:


Note that in both Java and C#, annotations/attributes are a first-class language construct. This is useful for many reasons: they improve readability and comprehension, they are type-safe, you can attach a debugger and step into them, etc.

Today I learned that PHP also has a form of annotations…well, sort of! It seems that a couple of PHP frameworks (Symfony 2 and Doctrine 2) have “implemented” them not as a language construct but as a hack via comments:


Folks, those aren’t comments…that is code that will get executed! Yuck, this is wrong in so many ways…especially since there is an RFC for adding annotations to PHP in the works:

Just because you can, doesn’t mean you should!



Eclipse UX…to be desired

by jmorris 14. June 2012 10:50


Not much more to say about Eclipse…



Comparison of Various NOSQL Databases

by jmorris 25. March 2012 12:13

A couple of weeks back, while at SxSW, I attended an excellent presentation about NOSQL databases by Gary Dusbabek of Rackspace: NoSQL Databases: Breaking the Relational Headlock. The following post summarizes some of the key points and provides a comparison of the various technologies. He didn’t go over CouchDb, Couchbase or Membase, so I’ll add my own notes about those offerings as well, since I personally have used each.


The Problems with RDBMS

The major problems with traditional Relational Database Management Systems are the inability to scale linearly, a Single Point of Failure (SPoF), the lack of sharding features, and the need to de-normalize data to make it easier to use. Typically, to deal with scale, you would add processors, memory, disk space etc. to build a bigger box capable of handling increased volume or throughput. This is normally referred to as “vertical scaling”. Unfortunately, vertical scaling is not cost efficient: the cost of CPU and memory increases disproportionately to performance, so it’s cheaper and more efficient to cluster cheaper hardware (horizontal scaling). Additionally, RDBMS performance tends to suffer when transactions are introduced to ensure data correctness, consistency and isolation.

Considerations for Choosing

When choosing a NOSQL solution the following considerations must be evaluated:

  • Fault tolerance – what is an acceptable level?
  • Recoverability – volatility (in-memory/fast) or persisted (slower, but less volatile)
  • Replication – fully distributed or master/slave?
  • Access – polyglot drivers? Do they all offer consistent functionality?
  • Hooks – before/after command execution (sprocs and triggers)?
  • Distribution mode – sharding strategy?
  • Data model
    • Key/Value pairs?
    • Documents?
    • Data structures?
    • Query-ability?
  • Transactional semantics? BASE vs ACID?
  • Read vs Write throughput – where are your scaling issues? What are the usage patterns of your data?
  • Deployment, Management, Administration – how to add or remove nodes without affecting clients?

What NOSQL Offers

All that being said, NOSQL solutions are not necessarily a replacement for RDBMS, but a complement to handle issues of scalability and complexity. An example usage would be as the Q in CQS: store a master copy in a fully normalized form in an RDBMS and then push a de-normalized form into a NOSQL solution for scaling reads. Additionally, by virtue of being schema-less, development is typically easier and faster.

Some NOSQL Databases

The following is a non-exhaustive overview of NOSQL databases:


  • Master/slave replication – master is a SPoF
    • Asynchronous
    • Gives failover and reliability, but not consistency
    • Only master receives writes
  • Document-oriented, thus naturally denormalized – stored natively as BSON
  • Flexible schema
  • Programmer friendly
  • Many language drivers – C#, Java, PHP, Ruby, Python et al
  • Atomic on a single document for writes
  • Allows for complex queries – by ranges and multiple criteria for instance
  • Cons
    • Not good for DW/data analytics
    • Blocking offline compaction
    • SPoF – the master dies, everything dies


  • Master/Slave replication
  • Good for real-time stat tracking
  • Very fast – in memory database
  • Volatility – in memory database – potential for data loss
  • Like Memcached, but with data structures: lists, sets, hashtables
  • RAM limitations – whole set fits in memory, but also allows for offline storage
  • Good when the entire dataset can fit in memory


  • Fully distributed – shared nothing – no SPoF
  • Relationships via links
  • Map/Reduce framework
  • Completely schema-less – keys and buckets
  • Scales linearly
  • Tunable consistency  - can adjust for read vs write optimization etc
  • Pre and Post commit hooks
  • Pluggable backend storage
    • Bitcask – everything in memory
    • InnoDb –everything won’t fit in memory
    • Memcached-like in memory
  • Dynamic clustering via “vnodes” similar to Membase/Couchbase vbuckets – when a node is added or removed the data is automatically re-indexed
  • Data is stored unsorted
  • Written in Erlang


  • Has a query language called CQL – SQL like syntax
  • Dynamo based distribution system – BigTable like
  • Allows for range queries, but prone to “hotspots” – uneven distribution of key/value pairs
  • Data center “rack aware”
  • Hadoop integration provided by
  • Configurable caching – like a super-fast Memcached
  • Some schema semantics – hybrid columnar and row based storage system
  • Keeps sort order of data, but can be changed on the fly
  • When growing the cluster “hotspots” may occur – uneven distribution of keys and values


  • Part of the Hadoop suite of tools: HBase, HDFS, Sqoop, Hive, etc
  • Versioned cells – you can query data as it existed at a particular point of time
  • Easy Hadoop integration by default
  • Hadoop NameNode is a SPoF – Secondary NameNode provides some redundancy
  • Schema maintenance requires downtime
  • Complicated balancing – HBase region servers then HDFS

Couchbase – not covered in session

  • Fast, in-memory database due to Memcached interface integration
  • Provides Map/Reduce framework for creating different views of the data you wish to display
  • Stores data as JSON documents via Key/Value pairs
  • Combines the best attributes of Memcached (caching), Membase (administration and scaling) and CouchDb (mapreduce)
  • No SPoF – fully replicated data
  • When a node is added data is automatically rebalanced and replicated across the cluster!
  • Depending upon bucket type, data can be persisted to disk or stored in-memory
  • Can easily support multi-tenancy via buckets – just create a bucket for each client
  • Written in Erlang – newer 2.0 version has more C/C++ for performance reasons
  • Product keeps changing…first it was Membase, then they added Memcached, and now CouchDb functionality – moving target for long-term NOSQL deployment

Next Up: Details…

This is just a cursory overview of several NOSQL databases. I’ll be evaluating each one in detail in the coming weeks to get a better feel for where each solution fits given a particular scenario. From what I can see, some are more specific in the scope of problem sets that they satisfy, while others are more general-purpose tools that satisfy a range of scenarios.



Couchbase, UnQL and Linq?

by jmorris 10. January 2012 22:37

I came across the following press release (a bit old) and liked what I read. Specifically, that Couchbase was working on UnQL support with MS:

“Couchbase unveiled and released to the public domain the UnQL query
language, (UNstructured Query Language). Jointly developed with
Microsoft and SQLite, UnQL is designed to provide a common query
language for NoSQL developers and help drive widespread adoption of
NoSQL technology. Each company has committed to delivering product
support for UnQL in 2012.”

By going to UnQL and partnering with MS, Couchbase puts itself in an awesome position to develop a Linq (IQueryable) implementation of UnQL. If this happens, then querying a NOSQL database or an RDBMS (or anything else) will be unified from the CLR perspective.

For instance, the following Linq query in the CLR (C# syntax):

var query = from f in Context.Foo
            where f.Something.Equals("something")
            select f;

Could emit UnQL if Context is NOSQL or SQL if RDBMS…genius. If only Java had something like IQueryable <sigh>.

It also looks like Couchbase is dumping the CouchDb HTTP REST API for the binary Memcached protocol, which should be a big win from a performance perspective (sorry CouchDb users). Membase already uses the protocol, so it’s just a matter of switching the HTTP REST API for UnQL.

Another development in Couchbase is that CouchDb has been forked. The good news is it’s still going to be open-source:

“As J. Chris Anderson notes in the comments, Couchbase is completely open source and Apache licensed:
Everything Couchbase does is open source, we have 2 github pages that are very active:

Probably the most fun place to jump into development is the code review:
Let me clarify, if you like Apache CouchDB, stick with it. I'm working on something I think you'll like a lot better. If not, well, there's still Apache CouchDB.”

While possibly a bit traumatic for CouchDb aficionados, this should be a huge win for Couchbase fans and for companies investing in Couch as a stable NOSQL solution.



Jeff Morris
