Alfresco Community Edition

How does one eat a pizza? Most people take a slice in their hand and start biting from the narrow top. Some use a knife and a fork to neatly cut the pizza slice into thin pieces. There are children who scrape the cheesy bits and eat them with a fork and discard the base totally. There could be numerous other ways of eating the same pizza. If not, this will make a good research topic.

The Alfresco community edition is like that ill-fated pizza slice, in my opinion. There are innumerable ways in which people use it. Many use the community edition out of the box. A large section use Alfresco Share or Workdesk as the UI on top of the out-of-the-box community edition server. Others dive deep into the code and make the changes that suit them (with or without contributing those changes back to the community). Still others build their own applications and use Alfresco only as the repository.

Alfresco is the numero-uno open source ECM platform out there. Most customers who think of scaling the ECM tree would have downloaded and played with the Alfresco community edition. We did the same thing long ago and decided to use Alfresco community edition as yet another supported repository for our ECM UI framework product.

Having worked with ECM products such as FileNet, we were always apprehensive about the scalability of Alfresco, the community edition to be precise. We have seen trillions of documents going into and coming back from FileNet repositories seamlessly, and thousands of users working with their documents and tasks in FileNet-based applications. Then again, FileNet runs on high-horsepower servers in a clustered or farmed environment in order to scale. Alfresco’s hardware requirements, on the other hand, are minimal. I can easily run Alfresco on my 32-bit laptop. Naturally, we sell FileNet-based solutions to customers who handle high volumes or have many users. Our Alfresco community edition based offerings are typically for customers with lower volumes and fewer users.

Recently one of our customers in India reported a performance issue with their Alfresco community edition based installation. They have fewer than 10 users but a large volume of documents. The customer uses our capture solution as well as our document management application, with the Alfresco community edition as the repository underneath. The complaint was that the system was very slow. At first we felt vindicated in our assumption that the Alfresco community edition cannot scale beyond a point.

A closer look at the issue revealed that there might be a way out. The customer uses our capture product to ingest between 15,000 and 25,000 documents a day into the repository. All the documents for a day go into a folder created specifically for that date. Further analysis led us to suspect that having too many documents in one folder was what hindered performance. So we changed the capture export configuration to create sub-folders within the day folder and to limit each sub-folder to fewer than 2,000 documents. SharePoint used to have a performance issue when the number of documents in a folder exceeded 2,000, and maybe that awareness nudged us into trying something like this. Anyway, the change worked like a charm and the repository sucked in the pending documents in a jiffy.
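To make the sub-folder idea concrete, here is a minimal sketch in Python of how an export step could pick a target folder for each document. It is not our capture product's actual configuration or any Alfresco API. The function name `target_folder`, the `batch-NNN` naming, and the exact cap of 1,999 are assumptions chosen only to illustrate "fewer than 2,000 documents per sub-folder".

```python
from datetime import date

# Assumed cap: the post only says "fewer than 2,000" per sub-folder.
MAX_DOCS_PER_SUBFOLDER = 1999


def target_folder(day: date, doc_index: int,
                  max_docs: int = MAX_DOCS_PER_SUBFOLDER) -> str:
    """Return a hypothetical repository path for the doc_index-th
    document (0-based) ingested on the given day."""
    if doc_index < 0:
        raise ValueError("doc_index must be non-negative")
    if max_docs < 1:
        raise ValueError("max_docs must be at least 1")
    # Integer division assigns documents 0..max_docs-1 to bucket 1,
    # the next max_docs documents to bucket 2, and so on.
    bucket = doc_index // max_docs + 1
    return f"{day.isoformat()}/batch-{bucket:03d}"
```

With this scheme, a day with 25,000 documents spreads across 13 sub-folders instead of landing in a single day folder.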

The customer is using the system happily, and so we are delighted too. As of now the customer has more than 4 million documents in the repository. The entire ECM infrastructure runs on a single lower-end server. The return on investment for this solution has been tremendous. It might not be a bad idea to build an ECM setup on the Alfresco Community Edition after all!
