
GIT – A Walkthrough of the Integration Manager Workflow

Many people switching from Subversion to GIT wonder about the somewhat different workflow. When it comes to collaboration, Subversion users are used to a centralized workflow with a single dedicated central repository to which all developers have read and write access. The main issue with this is that a single user committing buggy code has a heavy impact on all other developers. (Of course, you can always revert, but this has happened so often and cost us so much time that avoiding it is the much better choice.) Hence, it would be much better if we could have more than a single central repository: one repository per developer, to share each developer’s code, plus one main repository to which write access is granted only to an integration manager. This prevents the main repository from being polluted by buggy code.

In the GIT universe, the latter scenario is referred to as the “Integration-Manager Workflow”, and the main repository is called the “blessed” repository. We have found this workflow extremely useful. Still, whenever a new team member used to Subversion joins us, they run into serious trouble. Apparently, it is easier to start a workflow from scratch than to switch from another one.

In this article, we will walk through the Integration-Manager Workflow. As always, this is not a tutorial or a manual. There are excellent documents on the web that explain GIT concepts in detail (e.g. the free eBook “Pro Git” and an exhaustive tutorial by the GIT team). Here, we assume that you are familiar with GIT, branches, and local and remote repositories, and that you know how to access them.

In the scenario described here, we have a local user developing on his own machine who has access to a remote workstation hosting GIT repositories available to the rest of the team. Such a scenario is shown in Fig. 1. Since the remote workstation is located in the cloud, we strongly recommend using ssh for communication. As shown, the developer has only read access to the blessed repository, while he has full access to his own repository:

Fig. 1: Infrastructure diagram of the integration manager workflow

This walkthrough starts at the point where you have cloned your own remote repository (or created a remote repository from your local one). Fig. 2 shows the typical use case in which the local developer fulfills two main tasks: 1. get changes from the main (blessed) remote repository, and 2. send his changes to his own remote repository to make them available to other developers and, in particular, to the integration manager:

Fig. 2: GIT Integration Manager Use case

The integration manager’s workflow is not part of this article. Here, we focus on the local developer’s view and, thus, on the two use cases 1. “Get changes from main remote” (i.e. from the blessed repository) and 2. “Send changes to own remote”.

First, let’s have a look at the shorter “Send changes to own remote” use case. This is where the user has carried out some changes, preferably in non-master branches, merged them successfully into his master branch, and now wants to share his success with other users. A combination of the two git commands commit and push will do it:

git commit -a -m "What I have done"
git push ssh://user@remote.srv/remoterepo.git

Fig. 3: GIT - BPMN notation showing how to push to a remote repository
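Typing the full ssh URL on every push gets tedious, so registering the remote once under a name is a common refinement. The following self-contained sketch demonstrates the commit-and-push use case end to end; the repository paths and the remote name “ownrepo” are invented for this demo (in real life, the remote add would point at ssh://user@remote.srv/remoterepo.git):

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)

# A bare repository standing in for the developer's own remote.
git init -q --bare "$tmp/ownrepo.git"

# The local working repository with one change to publish.
git init -q "$tmp/local"
cd "$tmp/local"
git config user.email dev@example.com
git config user.name dev
git checkout -qb master
echo "my change" > work.txt
git add work.txt
git commit -qm "What I have done"

# Register the remote once under a name instead of retyping the URL.
git remote add ownrepo "$tmp/ownrepo.git"
git push -q ownrepo master

# The commit is now visible to anyone with read access to the remote.
git ls-remote ownrepo refs/heads/master
```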

However, when getting changes from the blessed repository, it turned out that, for safe daily work, changes should not be pulled directly into the master branch. Instead, a temporary branch should be created and switched to before pulling.

git remote add remoterepo ssh://user@remote.srv/remoterepo.git
git branch temp
git checkout temp
git pull remoterepo master
git checkout master

Remember that the first command is required only once. Many developers new to this workflow have told us they find it hard to see what is going on behind these commands. Hopefully, the process shown in Fig. 4 will help:

Fig. 4: GIT - BPMN notation showing how to pull from a blessed repository
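Note that the command sequence above stops at git checkout master; once the pulled changes on the temporary branch look good, they still have to be merged back. The following self-contained sketch can be run in a scratch directory; repository names and paths are invented for the demo, and only the branch dance mirrors the workflow described here:

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)

# A stand-in for the blessed repository with one commit on master.
git init -q "$tmp/blessed"
cd "$tmp/blessed"
git config user.email dev@example.com
git config user.name dev
git checkout -qb master
echo "release 1" > main.txt
git add main.txt && git commit -qm "blessed: release 1"

# The developer's local clone.
git clone -q "$tmp/blessed" "$tmp/local"
cd "$tmp/local"
git config user.email dev@example.com
git config user.name dev

# Meanwhile, the integration manager publishes a new commit.
cd "$tmp/blessed"
echo "release 2" >> main.txt
git add main.txt && git commit -qm "blessed: release 2"

# The workflow from the article: pull into a temporary branch first...
cd "$tmp/local"
git branch temp
git checkout -q temp
git pull -q "$tmp/blessed" master

# ...inspect, then fold the reviewed changes back into master
# and remove the temporary branch.
git checkout -q master
git merge -q temp
git branch -d temp
grep "release 2" main.txt
```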

And that’s it, that’s all the magic. Using these two processes, we have completely decoupled developers from one another, i.e. one developer’s mistakes do no harm to the others.

There are, of course, other ways to carry out these use cases, for example git fetch with a subsequent git merge into the master branch. For us, the git fetch workflow turned out to be more time-consuming than a trial-and-error pull into a temporary branch, since in most cases there is no need to review changes. Should such a need arise, pulling into a temporary branch will reveal it.


OAuth and Spring Security

While the Spring Framework – and in particular Spring Security – provides many ways to deal with authentication and authorization, some new approaches are becoming increasingly popular. As in most cases, the requirements of the cloud, along with those of multitenant applications, have been the driving force of this evolution.

For example, while SSL is definitely a secure layer, many (if not most) home-grown applications simply do not want to pay for a signed certificate (even less so given that such a certificate is usually restricted to a single domain) and thus simply do not use SSL. Of course, this is worst practice, but it is still common.

Also, basic authentication is still widespread, presumably owing to its simplicity. However, basic authentication combined with a non-secure transport layer is a security black hole.

And finally, with SaaS rising, comfortable single sign-on in a multitenant environment must be considered in modern software design. The latter issue, cloud-based single sign-on, is supported by Spring Security (see the reference manual).

There are more requirements emerging from the evolving cloud: for example, RESTful web services used for asynchronous data transfer need to be secured at the user level. While such scenarios can definitely be realized with Spring Security, it clearly focuses on securing classic synchronous data transfer and flows (for instance, I would rather not choose Spring Web Flow to implement the flow of a highly AJAX-driven web application). In order to implement flows based on highly asynchronous communication between client and server, there is definitely some hacking to do (see e.g. this tutorial).

However, the most important issue in times of millions of home-grown smartphone apps is sharing credentials with such a client. Using the same password over and over again because you cannot remember millions of different passwords? Again, worst but common practice. Now, what if your credentials are misused by your app? Wouldn’t it be much better if you did not have to share your credentials with the client at all?

OAuth is an open protocol that allows secure API authorization in a simple, standard way from desktop and web applications. It has been developed with most of the issues that emerged from cloud-based authentication and authorization in mind. Furthermore, it is being used by global players like Digg, Jaiku, Flickr, and Twitter, and the developers of OAuth are hopeful that Google, Yahoo, and others will soon follow. With so many heavyweight service providers relying on OAuth, it may definitely be considered a quasi-standard. Unfortunately, Spring Security is not (yet) shipped with out-of-the-box support for OAuth, although I am pretty sure that the extremely capable SpringSource team will deliver OAuth support as soon as possible.

For those who cannot (and should not) wait, a promising approach is described in OAuth for Spring Security. The purpose of this project is to provide an OAuth implementation for Spring Security. I have tested their implementation, and I strongly recommend their approach. For me, its most striking advantage is the combined power of OAuth and Spring Security: you keep the comfort of Spring and its AOP framework while gaining the security of OAuth.


How entity associations/relationships are mapped by an ORM

To demonstrate how mapping is carried out by an ORM, Hibernate was used with JPA2 annotation syntax and MySQL as the database. Two trivial entities, “Book” and “Store”, are used for both the one-to-many/many-to-one and the many-to-many demos. Admittedly, this is not a particularly good domain design, since an individual book cannot be physically located in two places, but for this demo it is an appropriate abstraction.

A book is considered to be owned by one store (or, in the many-to-many case, by many stores).

1. One-To-Many/Many-To-One

1.1 Unidirectional

1.1.1 Many-To-One

First, let’s have a look at an excerpt of the “Book” class. The association has to be defined here because, when using a unidirectional association of the many-to-one type, this is the “many” side:

@Entity
public class Book extends BusinessObject {
    // Unidirectional Many-to-one
    // No assoc. in Book required
    // getters/setters required
    @ManyToOne
    @JoinColumn(name = "store_fk")
    private Store store;
...
}

Since this is a unidirectional association coming from the Book side, the Store (“one”) side needs no further association.

@Entity
@Table(name = "store")
public class Store extends BusinessObject {
    // No assoc. required
...
}

Running this results in two tables being created, one per entity (ignoring the obligatory, Hibernate-specific “hibernate_sequences” table). “store_fk”, as defined by @JoinColumn in the Book class, is mapped as a column of the “book” table.

mysql> show tables from dbjava;
+---------------------+
| Tables_in_dbjava    |
+---------------------+
| book                |
| hibernate_sequences |
| store               |
+---------------------+
3 rows in set (0.00 sec)

mysql> describe `dbjava`.BOOK;
+-------------+----------+------+-----+---------+-------+
| Field       | Type     | Null | Key | Default | Extra |
+-------------+----------+------+-----+---------+-------+
| id          | int(11)  | NO   | PRI | NULL    |       |
| store_fk    | int(11)  | YES  | MUL | NULL    |       |
+-------------+----------+------+-----+---------+-------+
6 rows in set (0.00 sec)

mysql> describe `dbjava`.STORE;
+----------+--------------+------+-----+---------+-------+
| Field    | Type         | Null | Key | Default | Extra |
+----------+--------------+------+-----+---------+-------+
| id_store | int(11)      | NO   | PRI | NULL    |       |
+----------+--------------+------+-----+---------+-------+
2 rows in set (0.00 sec)

1.1.2 One-To-Many

In this case, the association is defined from the inverse point of view: Book declares no association, while the association is declared by the Store entity.

@Entity
public class Book extends BusinessObject {
    // No assoc. required
}
@Entity
public class Store extends BusinessObject {

    // Unidirectional One-To-Many
    // No assoc. in Book required
    // getters/setters required
    @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.EAGER)
    @JoinColumn(name = "store_fk")
    private Set<Book> books;

    public Set<Book> getBooks() {
        return books;
    }

    public void setBooks(Set<Book> books) {
        this.books = books;
    }
    ...
}

And the resulting database, inspected with the same queries, looks identical:

mysql> show tables from dbjava;
+---------------------+
| Tables_in_dbjava    |
+---------------------+
| book                |
| hibernate_sequences |
| store               |
+---------------------+
3 rows in set (0.00 sec)

mysql> describe `dbjava`.BOOK;
+-------------+----------+------+-----+---------+-------+
| Field       | Type     | Null | Key | Default | Extra |
+-------------+----------+------+-----+---------+-------+
| id          | int(11)  | NO   | PRI | NULL    |       |
| store_fk    | int(11)  | YES  | MUL | NULL    |       |
+-------------+----------+------+-----+---------+-------+
6 rows in set (0.00 sec)

mysql> describe `dbjava`.STORE;
+----------+--------------+------+-----+---------+-------+
| Field    | Type         | Null | Key | Default | Extra |
+----------+--------------+------+-----+---------+-------+
| id_store | int(11)      | NO   | PRI | NULL    |       |
+----------+--------------+------+-----+---------+-------+
2 rows in set (0.00 sec)

1.2 Bidirectional with the “many” side as the owner

In order to define a bidirectional association between the two entities, one side has to be the owner of the relationship. Here, Book (the “many” side) owns the association and carries the store’s foreign key in the database, while the Store side is marked as the inverse side by the “mappedBy” attribute on its collection.

@Entity
public class Store extends BusinessObject {

    // Bidirectional One-To-Many
    // This side is owner, thus, this collection is mappedBy
    // getters/setters required
    @OneToMany(mappedBy = "store")
    private Set<Book> books;

    public Set<Book> getBooks() {
        return books;
    }

    public void setBooks(Set<Book> books) {
        this.books = books;
    }
...
}
@Entity
public class Book extends BusinessObject {

    // Bidirectional Many-To-one
    // Book is owned by one store
    // getters/setters required
    @ManyToOne
    @JoinColumn(name="store_fk")
    private Store store;

    public Store getStore() {
        return store;
    }

    public void setStore(Store store) {
        this.store = store;
    }
...
}
mysql> show tables from dbjava;
+---------------------+
| Tables_in_dbjava    |
+---------------------+
| book                |
| hibernate_sequences |
| store               |
+---------------------+
3 rows in set (0.00 sec)

mysql> describe `dbjava`.BOOK;
+-------------+----------+------+-----+---------+-------+
| Field       | Type     | Null | Key | Default | Extra |
+-------------+----------+------+-----+---------+-------+
| id          | int(11)  | NO   | PRI | NULL    |       |
| store_fk    | int(11)  | YES  | MUL | NULL    |       |
+-------------+----------+------+-----+---------+-------+
6 rows in set (0.01 sec)

mysql> describe `dbjava`.STORE;
+----------+--------------+------+-----+---------+-------+
| Field    | Type         | Null | Key | Default | Extra |
+----------+--------------+------+-----+---------+-------+
| id_store | int(11)      | NO   | PRI | NULL    |       |
+----------+--------------+------+-----+---------+-------+
2 rows in set (0.00 sec)

2. Many-to-Many

Many-to-many relations typically need an additional join table that stores the associations between the two entities. Both classes need to define the association; however, since one side is regarded as the owner, the definition is asymmetric.

@Entity
public class Book extends BusinessObject {

    // Many to many, the other side is the owned
    // Getters & setters are required
    @ManyToMany(
            cascade = {CascadeType.PERSIST, CascadeType.MERGE},
            mappedBy = "books",
            targetEntity = Store.class
    )
    private Collection<Store> stores;

    public Collection<Store> getStores() {
        return stores;
    }

    public void setStores(Collection<Store> stores) {
        this.stores = stores;
    }
...
}

@Entity
@Table(name = "store")
public class Store extends BusinessObject {
    @ManyToMany(
            targetEntity = de.tayefeh.businessobjects.Book.class,
            cascade = {CascadeType.PERSIST, CascadeType.MERGE})
    @JoinTable(
            name="store_book",
            joinColumns = @JoinColumn(name = "store_id"),
            inverseJoinColumns = @JoinColumn(name = "book_id")
    )
    private Collection<Book> books;

    public Collection<Book> getBooks() {
        return books;
    }

    public void setBooks(Collection<Book> books) {
        this.books = books;
    }
...
}

In this case, the “store_book” join table is actually created, so we end up with three tables for two entities. However, no additional columns are added to the tables of the original entities:

mysql> show tables from dbjava;
+---------------------+
| Tables_in_dbjava    |
+---------------------+
| book                |
| hibernate_sequences |
| store               |
| store_book          |
+---------------------+
4 rows in set (0.00 sec)

mysql> describe `dbjava`.BOOK;
+-------------+----------+------+-----+---------+-------+
| Field       | Type     | Null | Key | Default | Extra |
+-------------+----------+------+-----+---------+-------+
| id          | int(11)  | NO   | PRI | NULL    |       |
+-------------+----------+------+-----+---------+-------+
5 rows in set (0.00 sec)

mysql> describe `dbjava`.STORE;
+----------+--------------+------+-----+---------+-------+
| Field    | Type         | Null | Key | Default | Extra |
+----------+--------------+------+-----+---------+-------+
| id_store | int(11)      | NO   | PRI | NULL    |       |
+----------+--------------+------+-----+---------+-------+
2 rows in set (0.01 sec)

mysql> describe `dbjava`.STORE_BOOK;
+----------+---------+------+-----+---------+-------+
| Field    | Type    | Null | Key | Default | Extra |
+----------+---------+------+-----+---------+-------+
| store_id | int(11) | NO   | MUL | NULL    |       |
| book_id  | int(11) | NO   | MUL | NULL    |       |
+----------+---------+------+-----+---------+-------+
2 rows in set (0.01 sec)

JUnit4 and Maven – minimal example

When I migrated an old project from JUnit 3 to JUnit 4, I ran into some problems. mvn test produced an error:

junit.framework.AssertionFailedError: No tests found in minimal.DoSomeActionTest

The test classes were no longer found. It turned out that I had to remove the inheritance from TestCase and annotate all test methods with @Test. Here is a trivial example:

package minimal;

import org.junit.Assert;
import org.junit.Test;

public class DoSomeActionTest {
    @Test
    public void testIsThisReallyTrue() {
        Assert.assertTrue(true);
    }
}

In case your project is managed by Maven, remember to make use of the maven-compiler-plugin to enforce Java 1.6 (annotations require at least Java 5):

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>2.3.1</version>
                <configuration>
                    <source>1.6</source>
                    <target>1.6</target>
                    <encoding>UTF-8</encoding>
                </configuration>
            </plugin>

When running mvn test you should get the following positive message:

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running minimal.DoSomeActionTest

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.047 sec

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESSFUL
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2 seconds
[INFO] Finished at: Sun Jul 11 17:22:44 CEST 2010
[INFO] Final Memory: 10M/19M
[INFO] ------------------------------------------------------------------------

I have prepared a working Maven project and published it on GitHub. Get your local copy by cloning with git:

git clone git://github.com/sastay/JUnit4-and-Maven-Example.git

NoSQL Database Paradigm Shift

If, only two years ago, somebody had dared to suggest that an RDBMS was not the first choice and the future of data modeling, nobody would have believed him. However, there has been a lot of talk about emerging non-relational (“NoSQL”) databases (refer to 1, 2, 3, 4). Evidence for their success: Facebook (refer to 5), Twitter, Google, and Amazon heavily rely on NoSQL solutions.

Rob Conery recently wrote a bold and astonishing statement:

“… If you need a join, you’re doing it wrong – default to denormalization.”

“Default to denormalization”. Wow! So here we are now: the times of “do normalize, do normalize, just do it” are over! Normalization always comes at a performance cost, and for simple data models there is actually no need to maximize integrity by normalizing. As far as I am concerned, the strongest arguments for non-relational NoSQL databases are

  • scalability,
  • performance and
  • simplicity.

There have been a lot of reports saying that migrating from relational to non-relational models saved a lot of time and money (refer to e.g. 6). I must agree: if you have ever been through the hell of coping with the object-relational impedance mismatch and the Anemic Domain Model dilemma, you would give anything to simplify your persistence layer. So many hours (months, years, decades) have been spent by developers dealing with these issues instead of focusing on the business logic.

However, when it comes to complex data models, and where data integrity is more important than performance, I still believe in the power of RDBMS and their low-level consistency enforcement mechanisms. Since mixed solutions are encouraged too (refer to 6, 7), I guess an intelligent trade-off is the best solution: using both data models side by side brings you the best of both worlds.

I am quite convinced that more and more project leaders will (have to) prefer NoSQL databases wherever possible and carry on using RDBMS wisely wherever data integrity and complexity are an issue.

---- EDIT 08 Oct 2010 ----

I already expected a quite controversial discussion; however, one important point needs to be added: most NoSQL implementations are still in a very immature state. Having played around with Java and Cassandra, I must say that the integration turned out to be so hard (and buggy) that I stopped wasting my time and returned to good old Hibernate/JPA/PostgreSQL.

In particular, I would like to mention Jonathan’s comment: “you wrote this post only a day after Foursquare published their ‘post-mortem’ after crashing with MongoDB and Digg seriously losing its users and firing their CTO for choosing NoSQL architecture.”

So, with respect to my own experience and that of some other guys, I should add: NoSQL is a great idea, but using it in production is, in my opinion, a bad idea, because most NoSQL DBMSs are still too immature.

Nevertheless, I still like the idea, and I am looking forward to the time when we have some NoSQL DBMSs that prove to be stable enough for production.


Android NDK – Google’s Native Development Kit for Android

I’m getting more and more impressed by Android development. Besides the obligatory SDK, Google offers a so-called “Native Development Kit” (Android NDK). Well, what is it good for? It compiles C and C++ sources to native binaries. Right! You can hardly generate faster code any other way.

So Android really provides the luxury of the Java language (along with its huge number of frameworks and its support) as well as a C/C++ compiler (namely gcc) for performance-critical tasks.

Even better: with Rev 3, the Android NDK has added support for the OpenGL ES 2.0 native library, which must be the best thing that could possibly happen to all 3D geeks out there.

As far as I am concerned, I have already compiled some of my ancient, performance-critical math routines, created a nice Android GUI, and now I am using them from my cute little cell phone. Really very, very nice.


Preventing SSH Brute Force Attacks

I’ve been looking for a way to prevent ssh brute-force attacks. Although they are not particularly dangerous if you have prohibited password login (which you should have done in any case), they had been spamming my log files. Asking the almighty search engine for relief, I found a number of interesting articles about attack blockers, such as DenyHosts.

I’ve just installed the package on my private OS X server via MacPorts. However, it took me a while to find the installation locations of all the required files. After touching /etc/hosts.deny (the file DenyHosts uses to store suspicious IPs so that tcp_wrappers can block them), copying /opt/local/share/denyhosts/denyhosts.cfg-dist to somewhere reasonable (e.g. /etc/denyhosts.cfg), and modifying it to my needs (adding an e-mail address etc.), I was able to start DenyHosts for a test run with:

sudo /opt/local/Library/Frameworks/Python.framework/Versions/2.6/bin/denyhosts.py --config=/etc/denyhosts.cfg

I got a nice email telling me that, based on my /var/log/secure.log, some IPs had been added to hosts.deny. Furthermore, some interesting data had been stored in /opt/local/share/denyhosts/data.
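For orientation, the handful of settings I touched in /etc/denyhosts.cfg looked roughly like this. The key names come from the shipped denyhosts.cfg-dist; the values are examples, not my actual setup:

```
SECURE_LOG  = /var/log/secure.log
HOSTS_DENY  = /etc/hosts.deny
ADMIN_EMAIL = admin@example.com
```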

However, I prefer DenyHosts to run in daemon mode and to synchronize with data collected from the cloud, so I inserted SYNC_SERVER = http://xmlrpc.denyhosts.net:9911 into denyhosts.cfg and started DenyHosts with some additional options:

sudo /opt/local/Library/Frameworks/Python.framework/Versions/2.6/bin/denyhosts.py --config=/etc/denyhosts.cfg --sync --daemon

And now I feel much more comfortable.
