Cassandra – How to Handle Large Media Files

Using Apache Cassandra as a highly available Big Data platform is in my eyes a good choice, as a Cassandra cluster is easy to handle. We are using Apache Cassandra as an archive platform for our human-centric workflow engine Imixs-Workflow. But how does Cassandra perform with large files?

If you start with Cassandra you have to change the way you use databases, especially when you come from the SQL world. Although Cassandra can handle very large amounts of data easily, you have to consider the concept of the partition size. In short, this means that the data within a partition (defined by the partition key) should not exceed 100 MB. If you plan to store large files (e.g. media files) you need to split your data into smaller chunks. In the following I will explain in short how this can be done.
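To give an idea, here is a minimal sketch of the chunking approach, assuming the DataStax Java driver 3.x and a hypothetical table chunks(document_id text, chunk_nr int, data blob) with PRIMARY KEY (document_id, chunk_nr). Each chunk becomes its own row; for files far beyond the recommended partition size you would additionally add a bucket number to the partition key:

// Minimal sketch: split a file into 1 MB chunks and store each chunk
// in its own row. Keyspace and table names are hypothetical:
//   CREATE TABLE media.chunks (document_id text, chunk_nr int, data blob,
//                              PRIMARY KEY (document_id, chunk_nr));
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

public class ChunkWriter {

    static final int CHUNK_SIZE = 1024 * 1024; // 1 MB per chunk

    public static void main(String[] args) throws IOException {
        String fileName = args[0];
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("media");
             InputStream in = new FileInputStream(fileName)) {

            PreparedStatement insert = session.prepare(
                    "INSERT INTO chunks (document_id, chunk_nr, data) VALUES (?, ?, ?)");

            int chunkNr = 0;
            int len;
            byte[] buffer = new byte[CHUNK_SIZE];
            while ((len = in.read(buffer)) > 0) {
                // wrap only the bytes actually read from the stream
                session.execute(insert.bind(fileName, chunkNr++, ByteBuffer.wrap(buffer, 0, len)));
                buffer = new byte[CHUNK_SIZE]; // fresh buffer for the next chunk
            }
        }
    }
}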

MVC 1.0 – Binding Input Values

MVC 1.0 is the new action-based web framework for Jakarta EE. One advantage of the new technology is that you can work with plain HTML. With the expression language (EL) provided by JSP and Facelets you can easily bind values from a CDI controller to an input field:

<dl>
  <dt>Name:</dt>
  <dd>
    <input type="text" value="#{dataController.name}" name="name" />
  </dd>
</dl>

This also works for textareas:

<dl>
  <dt>Description:</dt>
  <dd>
    <textarea name="description">#{dataController.description}</textarea>
  </dd>
</dl>

A little bit more tricky in this scenario is the handling of select options (combo boxes). As the selected item is defined by an attribute within the option tag, a combination with EL is clumsy. But with the help of jQuery this can also be achieved easily. See the following example:

<dl>
  <dt>Sort Order:</dt>
  <dd>
      <select name="sortOrder" id="sortOrder">
        <option value="DESC">Descanding</option>
        <option value="ASC">Ascanding</option>
      </select>
  </dd>
</dl>

<script type="text/javascript">
$(document).ready(function() {
  // update the select option....
  $("#sortOrder").val("#{dataController.sortOrder}");
});
</script>

See also the tutorial “MVC 1.0 in Java EE 8 – Handling Form Submits” to learn how you can submit your form inputs with MVC 1.0.
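For completeness: the dataController bean used in the EL expressions above is just a named CDI bean. A minimal sketch (the class and property names are assumptions matching the examples) could look like this:

// Minimal sketch of the CDI controller assumed by the EL expressions above.
// The class name and properties are hypothetical.
import javax.enterprise.context.RequestScoped;
import javax.inject.Named;

@Named("dataController")
@RequestScoped
public class DataController {

    private String name;
    private String description;
    private String sortOrder = "DESC"; // default sort order

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public String getDescription() { return description; }
    public void setDescription(String description) { this.description = description; }

    public String getSortOrder() { return sortOrder; }
    public void setSortOrder(String sortOrder) { this.sortOrder = sortOrder; }
}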

javax.ws.rs.client.Client and Form Based Authentication

Today I implemented a javax.ws.rs.client.ClientRequestFilter for a form-based authentication. The FormAuthenticator class can be used in combination with a javax.ws.rs.client.Client to interact, for example, with a REST API secured by a login form. Such a login form in Java EE typically uses the request URI ‘/j_security_check’ with the form input fields ‘j_username’ and ‘j_password’. As a result of a successful login the browser stores a cookie named “JSESSIONID” which needs to be sent with every request.
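Simplified, the filter performs the form login once and then adds the session cookie to every request. The following is a reduced sketch of this idea, not the complete implementation from GitHub:

// Reduced sketch of the FormAuthenticator: post the credentials to
// /j_security_check once, remember the JSESSIONID cookie and add it
// to every subsequent request.
import javax.ws.rs.client.ClientBuilder;
import javax.ws.rs.client.ClientRequestContext;
import javax.ws.rs.client.ClientRequestFilter;
import javax.ws.rs.client.Entity;
import javax.ws.rs.core.Cookie;
import javax.ws.rs.core.Form;
import javax.ws.rs.core.NewCookie;
import javax.ws.rs.core.Response;

public class FormAuthenticator implements ClientRequestFilter {

    private final Cookie sessionCookie;

    public FormAuthenticator(String baseUri, String userid, String password) {
        // perform the form login once and remember the session cookie
        Form form = new Form().param("j_username", userid)
                              .param("j_password", password);
        Response response = ClientBuilder.newClient()
                .target(baseUri + "/j_security_check")
                .request()
                .post(Entity.form(form));
        NewCookie cookie = response.getCookies().get("JSESSIONID");
        sessionCookie = (cookie != null) ? cookie.toCookie() : null;
    }

    @Override
    public void filter(ClientRequestContext requestContext) {
        if (sessionCookie != null) {
            // send the JSESSIONID cookie with every request
            requestContext.getHeaders().add("Cookie",
                    sessionCookie.getName() + "=" + sessionCookie.getValue());
        }
    }
}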

The request filter can be added to a javax.ws.rs.client.Client like this:

....
// create a javax.ws.rs.client
client = ClientBuilder.newClient();
// create new formAuthenticator
FormAuthenticator formAuthFilter = new FormAuthenticator(rest_api_url, 
     userid, password);
// register the filter...
client.register(formAuthFilter);
// now you can GET, POST, ....
....

You can find the source code of this filter class on GitHub.

If you have any ideas for improvements your comments are welcome!

Do We Need an Open Protocol for Facebook?

These days we are again having a discussion about what is going on with Facebook. Mark Zuckerberg must explain his business model before a public committee. Many people are puzzled and wondering what exactly is being done with their personal data. Again and again it is argued that one cannot really leave Facebook as long as there is no alternative platform. Mark Zuckerberg himself explains to the US Congress that he only wants to bring people together. He wants to open a way for people to share their thoughts and ideas. OK, this is an honorable goal. But what do we basically need to achieve such a goal?

How to use Traefik.io as Static Proxy

Traefik.io is a very cool open source project providing a powerful reverse proxy. The project focuses mainly on container-based architectures like Docker Swarm. In such an environment Traefik.io is able to recognize new containers in a network and dynamically computes the route from the frontend to the corresponding backend service. I wrote about this functionality in combination with Docker Swarm already in my blog: Lightweight Docker Swarm Environment. This concept is also part of the Imixs-Workflow project.

But what if you just want to add a kind of static route which has nothing to do with container-based services? I had this situation when I wanted to redirect incoming requests for a specific host name to an external server – outside of my Docker Swarm.

To realize this, you can add a frontend rule under the section [file] at the end of your traefik.toml file. This is an example of what such a rule can look like:

...
[file]

[backends]
 [backends.backend1]

 [backends.backend1.servers]
   [backends.backend1.servers.server0]
   url = "http://some.host.de:12345"
   # note that you cannot add path in 'url' field
 
[frontends]
  [frontends.frontend1]
  entryPoints = ["http"]
  backend = "backend1"
  passHostHeader = true
  [frontends.frontend1.routes]
    [frontends.frontend1.routes.route0]
    rule = "Host:www.myweb.com"

This rule proxies requests for “www.myweb.com” to the host “some.host.de:12345”. See also the discussion here.
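Assuming Traefik listens on port 80 of your local machine, you can verify the new route with a simple request that sets the corresponding host header:

$ curl -H "Host: www.myweb.com" http://localhost/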

Imixs-Workflow 4.2.6 released

Today I released version 4.2.6 of Imixs-Workflow. The new release is prepared for the Imixs-Archive feature, which is the next big thing in Imixs-Workflow. The new version also includes some improvements of the Rest API and several bug fixes. The release notes can be found on GitHub.

Docker Service to Backup a PostgreSQL / MySQL Database

I have written a Docker service that can be used for a periodic backup of a PostgreSQL database. The container can be used as part of a Docker stack in a docker-compose.yml file.

version: '3.1'
services:
  ...
  backup:
    image: imixs/backup
    environment:
      SETUP_CRON: "0 3 * * *"
      BACKUP_DB_TYPE: "POSTGRESQL"
      BACKUP_DB_USER: "postgres"
      BACKUP_DB_PASSWORD: "xxxxxxxxxx"
      BACKUP_DB_HOST: "db"
      BACKUP_LOCAL_ROLLING: "5"
  ...

The service runs a cron job and uploads backup files automatically into a remote backup space via SFTP/SCP.

Backup MySQL

You can also use this Docker Image to backup a MySQL Database. Just change the environment variable ‘BACKUP_DB_TYPE’:

BACKUP_DB_TYPE: "MYSQL"

Of course the service also provides methods to restore the data. The service is published on GitHub and DockerHub where you will find more details.

Running Hadoop with Docker Containers

If you play around with Apache Hadoop, you can hardly find examples built on Docker. This is because Hadoop is rarely operated via Docker but mostly installed directly on bare metal. Above all, if you want to test built-in tools such as HBase, Spark or Hive, there are only a few Docker images available.

A project which fills this gap comes from the European Union and is named BIG DATA EUROPE. One of the project objectives is to design, realize and evaluate a Big Data Aggregator Platform infrastructure.

The platform is based on Apache Hadoop and consistently built on Docker. The project offers basic building blocks to get started with Hadoop and Docker and makes the integration with other technologies or applications much easier. With the Docker images provided by this project, a Hadoop platform can be set up on a local development machine, or scaled up to hundreds of nodes connected in a Docker Swarm. The project is well documented and all results are available on GitHub.

For example, to setup a Hadoop HBase local cluster environment takes only a few seconds:

$ git clone https://github.com/big-data-europe/docker-hbase.git
$ cd docker-hbase/
$ docker-compose -f docker-compose-standalone.yml up
Starting datanode
Starting namenode
Starting resourcemanager
Starting hbase
Starting historyserver
Starting nodemanager
Attaching to namenode, resourcemanager, hbase, datanode, nodemanager, historyserver
namenode | Configuring core
resourcemanager | Configuring core
.........
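Once the containers are up, you can verify the setup by opening an HBase shell inside the running container (the container name hbase is taken from the compose output above):

$ docker exec -it hbase hbase shell
hbase(main):001:0> status

The status command should print the current cluster state with the active master and the number of region servers.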