The Elastic Stack: Future of ELK Platform (elastic.co)
108 points by hurrycane on Feb 17, 2016 | hide | past | favorite | 87 comments


I got pretty excited to read about this unified endeavor, and hungrily clicked the CTA to Get 5.0 Alpha.

> Don't see a button? You may have to disable your ad blocker.

What a shitty experience. You're pitching a product that you're clearly intending to sell commercially (which is both fine and dandy), and your first interaction of this brave new World Stack is to tell me I'm browsing wrong! And that's immediately after I click the previous "Get 5.0 Alpha" Call To Action (which was a button that appeared just fine, mind you) and I'm faced with the bitter realization that it's not Get, it's Get In Our CRM.

This kind of thing rubs me the wrong way, hard. I live in the B2B world, I totally get the pitch and the signup, but don't mislead me on the call to action, and don't make me distrust your intentions (or web development skills, for that matter) by telling me I need to open my privacy and performance preserving ad blocking gate and let you come over to my yard just to fill out a damn signup form on your fancy clipboard.

It's inauthentic, and I don't think I'm the only one who feels that way.


I don't care about the authenticity/ickiness/etc.; it just strikes me as a terrible interface decision, one that would make me worry they've made other terrible interface decisions where it matters -- in the product itself.

The idea of forcing the user to change a browser setting to get work done is a very "enterprisey" mindset. An old way of thinking that it's the user's fault if they can't use the software.


The Kibana 4 interface looks nice at first, but it's actually very confusing, even more confusing than the old interface.


Maybe you'll be happy that 5 is yet another new interface.


And another change of underlying stack? To what this time? Kibana 2 was PHP, then (K3) it was static HTML + user-side JavaScript, and Kibana 4 runs on Node.js.


Not to mention when you fill it out it says:

> Success! We've got your information and will send you a note when 5.0 Alpha is available.

With no links to documentation or github or anything. Having me sign up and get told I'll be emailed at some unspecified time when it's available was very disappointing.


Also annoying is that in order to watch a video overview of many of their products (i.e. Elasticsearch 2.0, Beats, Watcher, Shield, ES-Hadoop) I have to register (name, email, use case).

Sorry, I'm not going to register just to watch the videos.


This is the latest in a long string of irritations that will have me looking for a replacement logging datastore post-haste. I've already replaced Logstash with Heka; I guess ES is next.


Good luck with that searching, and remember to share with HN.

I find ElasticSearch somewhat brittle (I need to restart it every few weeks or so, because it stops accepting any data or queries), and I really do want to replace it, since it's a memory hog (no data, and it already needs 230MB RAM), but I haven't found any sensible log storage yet. All I got is this document searching engine.


That was my experience exactly. The value of being able to search through our logs was immediately apparent, so when it kept tanking every couple weeks it was fairly easy to talk my boss into paying a tiny bit more for Loggly. AFAICT Loggly is the ELK stack with a different theme applied to the UI, and I don't have to worry that it's down when I need it.

The preceding has been an unpaid endorsement for Loggly.


I would much prefer to have something hosted by me and open source at the same time (I'm looking at you, Splunk). I'm desperate enough to write such a thing myself (I'm already doing some research on indexing semi-structured data), even if it turned out not-too-scalable (I only have ~1.5GB/day of uncompressed logs), but I'm scared by the need to tolerate abrupt shutdowns and hitting the free disk space limit, i.e., database recoverability.


Hey, I looked at Splunk in the last few days as well, but it seems really expensive once you want to get beyond 500MB a day. They don't have an OSS "community" version, do they? Because I totally feel you on that.

Basically I need ELK, but I need it to be a lot more stable and documented better.


Loggly is a five digit monthly bill at our present logging volume. No thanks.


I suggest taking a look at logz.io - it's essentially ELK as a service. We take care of many problems ES has and also offer extra features such as Alerting, Apps and Insights. I'm one of the core developers there, so I'm not that fluent in pricing, but last time I checked it wasn't expensive and definitely worth your time and hosting costs. One of the great things about it is that you can simply migrate your dashboards and just ship to our URL.


What are you using? I'd rather run it myself, but like I said ELK was a bit too bleeding edge for me in the stability dept.


Heka for ingestion/parsing/message routing, RabbitMQ as the queuing/delivery mechanism, Elasticsearch as datastore, Grafana for visualization.


230MB of heap is very small for most ES usages, and is likely the reason you need to restart it so often. I run some very large clusters, and even with older versions of ES I've seen them run billions of indexing and query operations without problems for 4-6 months without any restarts.


> 230MB of Heap is very small for most ES usages

230MB RSS, and this is not "ES usage", it's just after starting it with no data whatsoever. I've seen databases which use less memory than that when working.

> I run some very large clusters [...] without problems for 4-6 months without any restarts.

Well, good for you. Being a sysadmin, I understand very well that my ES server may be a specimen combining bad kernel compilation, broken JVM version, unstable ES release, and ground water emanating bad energy, so I don't complain much. Still, I would be very, very happy to see any alternative for storing JSONified logs.


Despite what they recommend, I'd move to G1GC over CMS. Did wonders for our stability. This is on ES2.1.0 & OpenJDK 1.8.0-u80, IIRC.
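For anyone wanting to try the same: on ES 2.x the GC flags live in the startup script (typically `bin/elasticsearch.in.sh`, though paths and exact defaults vary by version and package, so treat this as a sketch). Swapping CMS for G1 looks roughly like:

```shell
# bin/elasticsearch.in.sh (sketch -- exact stock flags vary per ES version/package)
# Comment out the CMS flags ES ships by default, along these lines:
#   JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC"
#   JAVA_OPTS="$JAVA_OPTS -XX:CMSInitiatingOccupancyFraction=75"
#   JAVA_OPTS="$JAVA_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
# ...and enable G1 instead (setting both collectors at once makes the JVM refuse to start):
JAVA_OPTS="$JAVA_OPTS -XX:+UseG1GC"
```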


Which version? The reason I ask is that 2.x is a lot more stable for me than previous versions. I have to restart much less often and I haven't had ES "forget" my index since upgrading.


Some 1.x, and I don't intend to upgrade to 2.x, as it requires Kibana upgrade, and I don't want to deal with 150MB Node.js insanity Kibana 4 brings.


I'm looking at Graylog right now.


Curious if you had a chance to look at Fluentd (disclaimer: I'm one of the maintainers of Fluentd). If you did, I'm interested in your honest feedback.


We already have a transport pipeline; what's needed is a decently scalable, searchable, easy-to-use datastore for short/medium/long-term log analysis.


Not yet, but I'll give it a shot. How's sharding/scalability?


Right? An ES replacement in Go would be so great.


Why do you care about the language?


For one thing it means you wouldn't have to screw around with the peculiarities of the JVM.


Ridiculously easy deployment and lower memory footprint.


Is there already Lucene for Go? ;-)


http://www.blevesearch.com/

It has a very long way to go to catch up with Lucene, but it's a very promising start.


Naïve question from a bystander: how much better is Lucene compared to bleve?


/aside: I get rewriting some utility (say, a db driver) for the n-th time in the language du jour, but (possibly poorly) replicating things like Lucene never made sense. Lucene is fairly solid tech by all accounts.


So I was interested in finding out more about Beats. From the product page I click 'View More' beside 'Beats Overview & Demo Video'... and get taken to a registration form with all fields mandatory... eh... no... forget it then.

What the hell has happened to Elastic?!


Please take a look at Heka (http://hekad.rtfd.org) - it's a fairly complex tool, but I'm convinced that it's infinitely better than anything Elastic will put out anytime soon for log shipping.


Looks nice! I wasn't aware of Heka either (I've been out of the Elasticsearch world for a year or so), so thanks :)


Does anyone have a good description of the ES query DSL? In spite of the time I've spent with it, I'm consistently unable to write a reasonable query without going to google first, even for basics.


Here is what I have linked on one of my search pages (for advanced customers to examine):

https://www.elastic.co/guide/en/elasticsearch/reference/1.7/...

I have SEVERAL elasticsearch-hellride.txt files stored in my documents folder with a bunch of example queries, because it's so wordy I can't keep it all in my head. I just refer to those every few months when I'm adding functionality. I wouldn't get caught up with having to google for basics. Find some things that work FOR YOU and save those, with your personal notes added in the right places.

Here is what I can recommend as far as plugins go:

http://www.elastichq.org/ - clean, simple interface, but not quite as powerful as:

KOPF: https://github.com/lmenezes/elasticsearch-kopf

This will give you access to a great query interface. You can use the GUIs here to build a few queries until you get a feel for what the JSON should look like.

How I use elasticsearch is kind of... Well. It's just what I do. We have PDFs that are OCRed and we store the text in the document along with ~50 fields that are entered by humans. It's probably too complex, but it's financial data and people love to be super verbose with their queries. I can't rely on the OCR to be perfect for SEC filings.


Kopf is phenomenal. All the node data and index stats that I want, all super-accessible.


If you read the ES guide [1], it does a better job explaining how to use the DSL and gives better examples. I also recommend using a client API [2] if one is available for you. They are much easier to use than curl.

1. https://www.elastic.co/guide/en/elasticsearch/guide/current/...

2. https://www.elastic.co/guide/en/elasticsearch/client/index.h...


I typically use elasticsearch-py[0] to build and test queries. Deeply nested JSON documents are a headache.

[0] http://elasticsearch-py.readthedocs.org/en/master/
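Since the DSL is just nested JSON, keeping skeletons around (as the sibling comment suggests) works well. A minimal ES 2.x-style sketch -- the field names (`message`, `@timestamp`) and values here are made-up examples, not anything specific from the thread:

```python
import json

# A typical bool query skeleton (ES 2.x DSL): a full-text match scored
# with "must", plus a non-scoring time-range "filter" clause.
# Swap in your own field names.
query = {
    "query": {
        "bool": {
            "must": [{"match": {"message": "error"}}],
            "filter": [{"range": {"@timestamp": {"gte": "now-1h"}}}],
        }
    },
    "size": 50,
}

print(json.dumps(query, indent=2))
```

With elasticsearch-py you'd then pass the dict straight in, roughly `es.search(index="logstash-*", body=query)`, which beats hand-escaping it into a curl one-liner.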


I was excited and hopeful to try the ELK stack a while back but, like many of the other commenters here, decided it was too unreliable and brittle.

I have an open issue on Logstash and spent a fair amount of time detailing it, but have gotten no feedback. And then I realized there are 600+ open issues! https://github.com/elastic/logstash/issues/4389

I'm considering using Syslog-ng and would love to hear if anyone has comments on that. Based on other comments here, will be checking out Riemann and Fluentd as well. https://syslog-ng.org/


Fluentd is great, and works very well to inject logs into Elasticsearch. We were using it at OpenDNS. We didn't even try Logstash, as Fluentd did the job.

At OVH, both turned out to be way too slow, so we wrote Flowgger which is heavily used by all our services: https://github.com/jedisct1/flowgger


I'm not an elastic person, but I can shed some light on this: you're holding it wrong. It's not a bug. You have multiple config files in one directory - if you do that, all those files are combined into one, which means each event gets handed to each of your individual outputs - multiplying the message. See https://www.elastic.co/guide/en/logstash/current/command-lin...

Feel free to hop on the IRC if you have further questions, there's usually somebody qualified to answer.


I appreciate the help, though isn't the point of `/etc/*/conf.d` directories generally that you have multiple config files? This is a common idiom that other packages handle correctly (differently?).

I have hopped on the logstash IRC at times to ask about some of this, though I guess not this exact item. In fact, there's a different (well-known issue) that the init script for logstash has the config path hard-coded: https://botbot.me/freenode/logstash/2015-11-17/?msg=54338903...

There's also the problem that logstash (and forwarder) doesn't seem to let me do anything useful with the file names. I could work around that, sure, but it would be nice to have meaningful file names (not the "ls." thing that LS uses). Syslog-ng, for comparison, gives you a lot of control of that.


> isn't the point of `/etc/*/conf.d` directories generally that you have multiple config files?

Yes, certainly. It's totally fine to place multiple config files there; I do as well. I split up my configs into the various outputs, inputs, etc. It's just that logstash combines them into a single pipeline and does not run a pipeline per config file. Nginx doesn't run a webserver per config file either :).

It's certainly something that's unexpected and could be much better documented, but alas, I'm just a user :)

(and I do agree, your issue could have been handled much better, especially since it's not actually a bug)
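For anyone else hitting this: the usual workaround is to tag events in the inputs and wrap each output in a conditional, so the single combined pipeline still routes events where you expect. A sketch in Logstash's config syntax (the types, paths, and index name are made up for illustration):

```
# 10-inputs.conf
input {
  file { path => "/var/log/app/*.log" type => "app" }
  file { path => "/var/log/nginx/*.log" type => "nginx" }
}

# 90-outputs.conf -- without the conditionals, every event would be
# sent to BOTH outputs, since all files are merged into one pipeline.
output {
  if [type] == "app" {
    elasticsearch { index => "app-%{+YYYY.MM.dd}" }
  }
  if [type] == "nginx" {
    file { path => "/var/log/archive/nginx.log" }
  }
}
```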


Awesome, thanks for the clarification. This might help (if I ever go back to using logstash at this point!).

Yeah, my point was more that they accepted my issue, but there's no action and there are more than 600+ other open issues. Seems Elastic is too busy branding and pushing breaking changes to their APIs.

I do sincerely appreciate your clarifications and comments on this one, though.


(I've never gotten used to HN's non-Markdown markup.)


Oh, well, interesting that I just notice your HN username is also that IRC username. I suppose you're the one that replied to me there! Thank you!


Likely; I tend to use the same name all over the internet :) You're welcome.


Don't get too excited about Riemann. It's a stream processing engine, not data transport. It has its uses, but not for shipping logs.

syslog-ng and its competitor, rsyslog, are fine, but they are concentrated around logs specifically. Fluentd (and logstash) can be used to transport other data, too, like monitoring or inventory.


Good info (from jedisct1 as well) -- the distinctions are indeed important!


So Logstash v5, is that a rewrite in Go? I ask because it seems all their other tooling is now written in Go, and also, the logstash agent and server are very resource-intensive in JRuby. If there's a v5 alpha, which isn't on GitHub, is it not open source?


Apparently all they've announced is that they'll release all their products in lockstep with a unified version number from now on.

I myself have hoped for a Go rewrite of Logstash for a long time, but there are apparently no plans for this. They are creating lightweight forwarders with their *beats, though. But they are only for forwarding to ElasticSearch, not a general log pipeline processor like Logstash.

FWIW there is a thing that's like Logstash in Go, it's Heka by Mozilla. I am very fond of it, but for some reason not many people seem to be aware of it or deploy it.


They have a couple of outputs from *beats though, not only ElasticSearch. Besides ElasticSearch there are outputs to console, file, Logstash and a deprecated Redis output.


I've been beating the drum for Heka ever since I found it.



What I didn't like about Kibana 4 was that they tried to force you to use a Node.js server. I wanted to embed Kibana in my webapp, but Elastic is trying to make you use their services and lock you into their platform. No surprise, I guess, but annoying nonetheless.


I'm not exactly fond of the Node requirement, but I fail to see how it locks you into their stack any more than before. I see why they needed some sort of app behind it, and well, Node is probably the first thing you pick when you're a JS developer by trade.


Has it ever been possible to embed Kibana in a webapp? Excluding iframe hacks, of course.


Yes, in v3. In v4, we're using this: https://github.com/kibana-community/kibana4-static


Is X-Pack going to be a subscription based service?


Considering all the components currently in it are only available if you're in their subscription plan, that's basically the core business model.


I wonder whether the components will be available a la carte like they are now, or if it's going to be a single bundle.


I've been extremely unimpressed with logstash - many of the plugins in the standard repo are poorly maintained, and certain misbehaving plugins can kill the entire logstash process. It's really inexcusably bad for such core-infrastructure software, tbh.


Dear Elastic:

For the love of internet-god, please stop your constant moving of stuff around. Now we have new logos.

Last year's fun, I was using logstash-shipper to ship logs. Early on, the package got pulled completely - sucks to be you if it's in your deployment script or in your documentation. Then it had a name change. Then it moved to one domain. Then it moved to another domain. Then it got switched out for Beats.

Not everyone finds setting up and maintaining an ELK stack so fascinating that they want to keep up to date with exactly where everything is this month. While you can do other things with ELK, the primary use-case is logging. Logging is supposed to be reliable and 'just work'. Every time I see the elastic website, something else has changed, and everyone is pushing the new stuff.

ELK is cool and all, but it's frustrating to follow when you just poke your nose in every few months.

Love,

- Vacri


Agreed. I stopped using Logstash for about a year and used it again for a bit about a month ago. Awful experience. Awful documentation. Deprecated shit everywhere. Inconsistent stackoverflow information and TWO external websites to help make logstash actually functional. Oh, and since Logstash is a Java based application - would it hurt to give some Java stacktrace log parsing configs?

Also, their shitty Debian repo management resulted in a bug that caused my company to lose $30,000.

The world needs more ELK hate.


It's not Java, it's Ruby. Which is even worse, because the distribution tarball with logstash weighs a ridiculous 71MB (?!?) and requires a JVM to run (?!?) (or at least nobody talks about it being runnable with MRI).

> Also, their shitty Debian repo management resulted in a bug that caused my company to lose $30,000.

Well, this is not their fault that much. If you had put any thought into using repositories, you wouldn't use random packages from random sources over which you have no control and no trust with regard to package retention policy or package quality.

Or maybe you would happily install also MongoDB from Mongo's site?


Elasticsearch is Java, and the logstash tarball includes it and Kibana, from memory, so you can run it as an all-in-one where logstash launches its own Elasticsearch.

While I don't have the tiniest font in my terminal, I still couldn't read the entire Elasticsearch process line in htop, even when I'd stretched the terminal all the way across three monitors! The middle one was an ultrawide! I really wish Java would stop using arguments instead of storing config somewhere...


> Elasticsearch is Java, and the logstash tarball includes it and kibana from memory,

Not really. It's just logstash, along with some plugins (and what the heck are Maven bindings doing there?).

ElasticSearch is another 29MB compressed, which is fine for a database-like thing, and Kibana 4.x takes 30MB compressed (150MB uncompressed, of which the bundled copy of Node.js is not even the biggest part).


Got me on the Ruby part. I meant to say JVM. My point was more about the fact that their site is incredibly short on examples.

FWIW, I inherited a thrice-removed infrastructure in a latency-sensitive environment. It was fun ;)

Logstash was installed on a couple of machines the night before I was going to load some configs from my machine. I hadn't loaded anything yet, but logstash-web was already running with 2s of uptime. For some reason (with upstart's help), the logstash installation included a broken webservice with a bad config. It got stuck in a JVM-birthing restart loop that was enough to cause significant impact. It happens.

(Funny enough, same place used mongodb. We ran into an issue once where database cleanups failed once mongodb reached around 50% of the disk. Local copy cleanups are fun.)


it's JRuby: best of both worlds LOL


As a replacement for Logstash, check out Flowgger: https://github.com/jedisct1/flowgger and Heka https://github.com/mozilla-services/heka


My favorite thing about logstash was when the XML parser would choke and logstash would just quietly stop processing events. I wish it were engineered as well as Riemann.


My favourite thing about logstash-forwarder was that it was apparently an actual intentional design decision to shut the service down if none of the log files it was watching moved in 24 hours. I'm still trying to figure that one out - a quiescent service... shuts itself down... by default... to what end? Ended up having to cronjob it to restart every day. If you wanted it not to do that, from memory you had to rebuild it from scratch.

I've never actually come across that in a service before, one that shuts itself down due to inactivity. I mean, they must be out there, but I never would have guessed it'd be intentionally designed into a log shipper.


Their renaming of ElasticSearch to Elastic really irked me. If only for the simple reason that searching for questions related to them on Google is harder now.


They could have gone with "Elastico" or something easier to trademark. I think that's why they keep saying "Elastic Stack" instead of just "Elastic".


Can you really trademark Spanish words?


Probably. Trademarks are contextual. Two companies can both trademark the word "Apple". One in the context of computers, another in music distribution. So long as they don't get into each other's business, it's OK. Hmm... Yeah, Apple Records wasn't too happy with iTunes.


+1

I maintain some Ansible roles, as well as some well-worn examples of ELK in production... And this marks the fourth time I'll have to basically rework _everything_ due to Elastic changing up the core architecture, naming, logos, etc.

I'm also less-than-thrilled with the performance for larger deployments. Having to babysit my log aggregation infrastructure (average about 50-100 events/sec across a few dozen hosts) is not very thrilling.


Have you had a chance to look at Fluentd? (Disclaimer: I'm a maintainer.) Elasticsearch is by far the most popular use case for Fluentd (much kudos to both the community and Elastic), and it's been pretty stable for a while now.


I gave it a try today. I wanted to send process stats to Graphite. I installed it on Ubuntu Trusty, installed the graphite plugin and the process watch plugin and... td-agent couldn't start: the watch metrics plugin was incompatible with the 0.12 version of Fluentd. So I resorted to collectl.


Sorry to hear this. Can you give me the exact name of the plugin? I'd make sure to look at it.


No, I haven't yet, though I have heard the name here and there. I'm not going to be looking at logs again for a few months, but thanks for the pointer.


Then remember Fluentd. I was using it to fill ElasticSearch with logs back when Kibana was still written in PHP by a separate developer and was meant as a somewhat specialized frontend for logstash, not a generic one for ElasticSearch. I'm still using it for shipping logs (again, to ElasticSearch) and metrics collected by monitoring, and I have even built a proof-of-concept stream processing engine for monitoring out of it.

You could go really far with Fluentd using or writing only a few plugins.
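To give a taste of what that looks like, a minimal Fluentd pipeline tailing a JSON log into Elasticsearch needs just two sections. A sketch in Fluentd's config syntax (paths and tags are made up; the output assumes the common fluent-plugin-elasticsearch plugin is installed):

```
<source>
  @type tail
  path /var/log/app/app.log
  pos_file /var/log/td-agent/app.log.pos
  tag app.access
  format json
</source>

<match app.**>
  # requires fluent-plugin-elasticsearch
  @type elasticsearch
  host localhost
  port 9200
  logstash_format true   # writes logstash-YYYY.MM.dd indices, Kibana-friendly
</match>
```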


Does ELK = Elasticsearch + Logstash + Kibana?


Yep. Elasticsearch is often used in other projects, but logstash and kibana are generally used in tandem with each other and Elasticsearch.



