I do not think there is a simple how-to for this. First you need to be clear about 
your data volumes in storage, in transit, and in processing. Then you need to be 
aware of what kinds of queries you want to run. Your expectation of millisecond 
latencies at the stated data volumes currently seems unrealistic. In any case, you 
need to provide much more information on where the data comes from, what you need 
to do with it, etc.

Another thing: please do not use XML or JSON for big data. You waste hardware 
resources and time, and harm the environment.
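To make the overhead concrete, here is a minimal Python sketch comparing JSON 
against a fixed binary schema for one record. The message fields (`sensor_id`, 
`temperature`, `timestamp`) are hypothetical, and in practice you would use a 
schema-based format such as Avro, Protobuf, or Parquet rather than raw `struct` 
packing; this only illustrates why text formats cost more per message.

```python
import json
import struct

# Hypothetical three-field message, just for illustration.
msg = {"sensor_id": 12345, "temperature": 21.5, "timestamp": 1461042060}

# JSON repeats the field names in every single message on the wire.
as_json = json.dumps(msg).encode("utf-8")

# A fixed binary schema (uint32, float64, uint64) carries no field names,
# so each record is a constant 20 bytes.
as_binary = struct.pack(
    "<IdQ", msg["sensor_id"], msg["temperature"], msg["timestamp"]
)

print(len(as_json), len(as_binary))
```

At tens of millions of messages, the per-record difference multiplies directly 
into network, storage, and parsing cost.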

> On 19 Apr 2016, at 07:01, Deepak Sharma <deepakmc...@gmail.com> wrote:
> 
> Hi all,
> I am looking for an architecture to ingest 10 million messages in micro 
> batches of seconds.
> If anyone has worked on a similar kind of architecture, can you please point 
> me to any documentation around it, such as what the architecture should look 
> like, which components/big data ecosystem tools I should consider, etc.
> The messages have to be in XML/JSON format, passing through a preprocessor 
> engine or message enhancer and then finally a processor.
> I thought about using a data cache as well for serving the data.
> The data cache should be able to serve historical data (maybe up to 30 days 
> of data) in milliseconds.
> -- 
> Thanks
> Deepak
> www.bigdatabig.com
> 