I got pretty excited to read about this unified endeavor, and hungrily clicked the CTA to Get 5.0 Alpha.
> Don't see a button? You may have to disable your ad blocker.
What a shitty experience. You're pitching a product that you're clearly intending to sell commercially (which is both fine and dandy), and your first interaction of this brave new World Stack is to tell me I'm browsing wrong! And that's immediately after I click the previous "Get 5.0 Alpha" Call To Action (which was a button that appeared just fine, mind you) and I'm faced with the bitter realization that it's not Get, it's Get In Our CRM.
This kind of thing rubs me the wrong way, hard. I live in the B2B world, I totally get the pitch and the signup, but don't mislead me on the call to action, and don't make me distrust your intentions (or web development skills, for that matter) by telling me I need to open my privacy and performance preserving ad blocking gate and let you come over to my yard just to fill out a damn signup form on your fancy clipboard.
It's inauthentic, and I don't think I'm the only one who feels that way.
I don't care about the authenticity/ickiness/etc.; that just strikes me as a terrible interface decision, one that would make me worry they've made other terrible interface decisions where it matters -- in the product itself.
The idea of forcing the user to change a browser setting to get work done is a very "enterprisey" mindset. An old way of thinking that it's the user's fault if they can't use the software.
And another change of underlying stack? To what this time? Kibana 2 was PHP, then (K3) it was static HTML + user-side JavaScript, and Kibana 4 runs on Node.js.
> Success! We've got your information and will send you a note when 5.0 Alpha is available.
With no links to documentation or github or anything. Having me sign up and get told I'll be emailed at some unspecified time when it's available was very disappointing.
Also annoying is that in order to watch a video overview of many of their products (e.g. Elasticsearch 2.0, Beats, Watcher, Shield, ES-Hadoop) I have to register (name, email, use case).
Sorry, I'm not going to register just to watch the videos.
This is the latest in a long string of irritations that will have me looking for a replacement logging datastore post-haste. I've already replaced Logstash with Heka; I guess ES is next.
Good luck with that searching, and remember to share with HN.
I find ElasticSearch somewhat brittle (I need to restart it every few weeks or
so, because it stops accepting any data or queries), and I really do want to
replace it, since it's a memory hog (no data, and it already needs 230MB RAM),
but I haven't found any sensible log storage yet. All I got is this document
searching engine.
That was my experience exactly. The value of being able to search through our logs was immediately apparent, so when it kept tanking every couple of weeks it was fairly easy to talk my boss into paying a tiny bit more for Loggly. AFAICT Loggly is the ELK stack with a different theme applied to the UI, and I don't have to worry that it's down when I need it.
The preceding has been an unpaid endorsement for Loggly.
I would prefer much more to have something hosted by me and being open source
at the same time (I'm looking at you, Splunk). I'm desperate enough to write
such thing myself (and I'm already doing some research on indexing
semi-structured data), even if it would turn out not-too-scalable (I only have
~1.5GB/day of uncompressed logs), but I'm scared by the need of tolerance for
abrupt shutdowns and hitting a free disk space limit, i.e., database
recoverability.
Hey, I looked at Splunk in the last few days as well, but it seems really expensive once you want to get beyond 500MB a day. They don't have an OSS "community" version, do they (?) because I totally feel you on that.
Basically I need ELK, but I need it to be a lot more stable and documented better.
I suggest taking a look at logz.io - it's essentially ELK as a service. We take care of many of the problems ES has and also have great extra features such as Alerting, Apps and Insights. I'm one of the core developers there so I'm not fluent in pricing, but last time I checked it wasn't expensive and is definitely worth your time and hosting costs. One of the great things about it is that you can simply migrate your dashboard and just ship to our URL.
230MB of Heap is very small for most ES usages, and is likely the reason for you needing to restart it often. I run some very large clusters and even with older versions of ES have seen them run billions of indexing and query operations without problems for 4-6 months without any restarts.
230MB RSS, and this is not "ES usage", it's just after starting it with no
data whatsoever. I've seen databases which use less memory than that when
working.
> I run some very large clusters [...] without problems for 4-6 months without any restarts.
Well, good for you. Being a sysadmin, I understand very well that my ES server
may be a specimen combining bad kernel compilation, broken JVM version,
unstable ES release, and ground water emanating bad energy, so I don't
complain much. Still, I would be very, very happy to see any alternative for
storing JSONified logs.
Which version? The reason I ask is that 2.x is a lot more stable for me than previous versions. I have to restart much less often and I haven't had ES "forget" my index since upgrading.
/aside: I get rewriting for the n-th time some utility (say a db driver) in language du jour, but (possibly poorly) replicating things like Lucene never made sense. Lucene is a fairly solid tech by all accounts.
So I was interested in finding out more about Beats. From the product page I click 'View More' beside 'Beats Overview & Demo Video'... and get taken to a registration form with all fields mandatory... eh... no... forget it then.
Please take a look at Heka (http://hekad.rtfd.org) - it's a fairly complex tool, but I'm convinced that it's infinitely better than anything Elastic will put out anytime soon for log shipping.
Does anyone have a good description of the ES query DSL? In spite of the time I've spent with it, I'm consistently unable to write a reasonable query without going to google first, even for basics.
I have SEVERAL elasticsearch-hellride.txt files stored in my documents folder with a bunch of example queries, because it's so wordy I can't keep it all in my head. I just refer to those every few months when I'm adding functionality. I wouldn't get caught up with having to google for basics. Find some things that work FOR YOU and save those, with your personal notes added in the right places.
Here is what I can recommend as far as plugins go:
This will give you access to a great query interface. You can use the GUIs here to build a few queries until you get a feel for what the JSON should look like.
How I use elasticsearch is kind of... Well. It's just what I do. We have PDFs that are OCRed and we store the text in the document along with ~50 fields that are entered by humans. It's probably too complex, but it's financial data and people love to be super verbose with their queries. I can't rely on the OCR to be perfect for SEC filings.
If you read the ES guide [1], it does a better job explaining how to use the DSL and gives better examples. I also recommend using a client API [2] if one is available for you. They are much easier to use than curl.
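For anyone keeping a cheat sheet like the one described above, here is a minimal sketch of a basic log query built as a plain Python dict and printed as JSON. The field names (`message`, `level`, `@timestamp`) and the helper `build_log_query` are assumptions for illustration, not anything from this thread, and the `bool` query with a `filter` clause assumes ES 2.x or later.

```python
import json


def build_log_query(text, level, since):
    """Build a bool query: full-text match on one field, exact filters on the rest.

    Field names are hypothetical; adjust them to your own mapping.
    """
    return {
        "query": {
            "bool": {
                # Scored, analyzed full-text match.
                "must": [{"match": {"message": text}}],
                # Unscored yes/no conditions.
                "filter": [
                    {"term": {"level": level}},
                    {"range": {"@timestamp": {"gte": since}}},
                ],
            }
        },
        "size": 20,
        "sort": [{"@timestamp": {"order": "desc"}}],
    }


query = build_log_query("connection refused", "error", "now-1h")
print(json.dumps(query, indent=2))
```

The printed JSON is the request body you would send to a `_search` endpoint, whether with curl or through a client library. Keeping a few builders like this next to your notes is roughly the scripted version of the .txt-file approach.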
I was excited and hopeful to try the ELK stack a while back, but, like all the other comments here, have decided it was too unreliable and brittle.
I have an open issue on Logstash, spent a fair amount of time detailing it, but have gotten no feedback. And then I realized there's 600+ open issues!
https://github.com/elastic/logstash/issues/4389
I'm considering using Syslog-ng and would love to hear if anyone has comments on that. Based on other comments here, will be checking out Riemann and Fluentd as well.
https://syslog-ng.org/
Fluentd is great, and works very well to inject logs into Elasticsearch. We were using it at OpenDNS. We didn't even try Logstash, as Fluentd did the job.
At OVH, both turned out to be way too slow, so we wrote Flowgger which is heavily used by all our services: https://github.com/jedisct1/flowgger
I'm not an elastic person, but I can shed some light on this: You're holding it wrong. It's not a bug. You have multiple config files in one directory - if you do that, all those files are combined to one, that means that each event gets handed to each of your individual outputs - multiplying the message. See https://www.elastic.co/guide/en/logstash/current/command-lin...
Feel free to hop on the IRC if you have further questions, there's usually somebody qualified to answer.
I appreciate the help, though isn't the point of `/etc//conf.d` directories generally that you have multiple config files? This is a common idiom that other packages handle correctly (differently?).
I have hopped on the logstash IRC at times to ask about some of this, though I guess not this exact item. In fact, there's a different (well-known issue) that the init script for logstash has the config path hard-coded:
https://botbot.me/freenode/logstash/2015-11-17/?msg=54338903...
There's also the problem that logstash (and forwarder) doesn't seem to let me do anything useful with the file names. I could work around that, sure, but it would be nice to have meaningful file names (not the "ls." thing that LS uses). Syslog-ng, for comparison, gives you a lot of control of that.
> isn't the point of `/etc//conf.d` directories generally that you have multiple config files?
Yes certainly. It's totally fine to place multiple config files there, I do as well. I split up my configs in the various outputs, inputs etc. It's just that logstash combines them to a single pipeline and does not run a pipeline per config file. Nginx doesn't run a webserver per config file either :).
It's certainly something that's unexpected and could be much better documented, but alas, I'm just a user :)
(and I do agree, your issue could have been handled much better, especially since it's not actually a bug)
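To make the behaviour described above concrete, here is a toy model in Python. It is not Logstash code: `merge_configs`, `deliver`, the file names, and the `when` guard are all hypothetical, and merging files in name order is an assumption. It only shows why one event reaches every output once the files are combined, and how guarding each output avoids the duplication.

```python
def merge_configs(config_files):
    """Combine every file's sections into one pipeline (toy model)."""
    pipeline = {"inputs": [], "filters": [], "outputs": []}
    # Assumption: files are merged in name order.
    for name in sorted(config_files):
        for section, plugins in config_files[name].items():
            pipeline[section].extend(plugins)
    return pipeline


def deliver(pipeline, event):
    """Return the names of all outputs that would receive this event."""
    received = []
    for output in pipeline["outputs"]:
        condition = output.get("when")  # hypothetical per-output guard
        if condition is None or condition(event):
            received.append(output["name"])
    return received


unguarded = merge_configs({
    "10-app.conf": {"inputs": [{"name": "app_log"}], "outputs": [{"name": "es_app"}]},
    "20-nginx.conf": {"inputs": [{"name": "nginx_log"}], "outputs": [{"name": "es_nginx"}]},
})
guarded = merge_configs({
    "10-app.conf": {"outputs": [{"name": "es_app", "when": lambda e: e["type"] == "app"}]},
    "20-nginx.conf": {"outputs": [{"name": "es_nginx", "when": lambda e: e["type"] == "nginx"}]},
})

event = {"type": "nginx", "message": "GET / 200"}
print(deliver(unguarded, event))  # ['es_app', 'es_nginx']
print(deliver(guarded, event))    # ['es_nginx']
```

In an actual Logstash config the equivalent of the guard is a conditional wrapped around each output, such as `if [type] == "nginx" { ... }`.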
Awesome, thanks for the clarification. This might help (if I ever go back to using logstash at this point!).
Yeah, my point was more that they accepted my issue, but there's no action and there are more than 600 other open issues. Seems Elastic is too busy branding and pushing breaking changes to their APIs.
I do sincerely appreciate your clarifications and comments on this one, though.
Don't get too excited about Riemann. It's a stream processing engine, not
data transport. It has its uses, but not for shipping logs.
syslog-ng and its competitor, rsyslog, are fine, but they are concentrated
around logs specifically. Fluentd (and logstash) can be used to transport
other data, too, like monitoring or inventory.
So logstash V5, is that a rewrite in go? I say this because it seems all their other tooling is now written in go, and also, logstash agent and server are very resource intensive in jruby.
If there's a v5 alpha which isn't on GitHub, is it not open source?
Apparently all they've announced is that they'll release all their products in lockstep with a unified version number from now on.
I myself have hoped for a Go rewrite of Logstash for a long time, but there are apparently no plans for this. They are creating lightweight forwarders with their *beats, though. But they are only for forwarding to ElasticSearch, not a general log pipeline processor like Logstash.
FWIW there is a thing that's like Logstash in Go, it's Heka by Mozilla. I am very fond of it, but for some reason not many people seem to be aware of it or deploy it.
They have a couple of outputs from *beats though, not only ElasticSearch. Besides ElasticSearch there are outputs to console, file, Logstash and a deprecated Redis output.
What I didn't like about Kibana 4 was that they tried to force you to use a Node.js server. I wanted to embed Kibana in my webapp, but Elastic are trying to make you use their services and lock you into their platform. No surprise, I guess, but annoying nonetheless.
I'm not exactly fond of the node requirement, but I fail to see how it locks you any more to their stack than before. I see why they needed some sort of app behind it and well, node is probably the first you pick when you're a js developer by trade.
I've been extremely unimpressed with logstash - many of the plugins in the standard repo are poorly maintained; certain plugins, when they misbehave, can kill the entire logstash process; it's really inexcusably bad for such core-infrastructure software tbh.
For the love of internet-god, please stop your constant moving of stuff around. Now we have new logos.
Last year's fun, I was using logstash-shipper to ship logs. Early on, the package got pulled completely - sucks to be you if it's in your deployment script or in your documentation. Then it had a name change. Then it moved to one domain. Then it moved to another domain. Then it got switched out for Beats.
Not everyone finds setting up and maintaining an ELK stack so fascinating that they want to keep up to date with exactly where everything is this month. While you can do other things with ELK, the primary use-case is logging. Logging is supposed to be reliable and 'just work'. Every time I see the elastic website, something else has changed, and everyone is pushing the new stuff.
ELK is cool and all, but it's frustrating to follow when you just poke your nose in every few months.
Agreed. I stopped using Logstash for about a year and used it for a bit about a month ago. Awful experience. Awful documentation. Deprecated shit everywhere. Inconsistent stackoverflow information and TWO external websites to help make logstash actually functional. Oh, and since Logstash is a Java based application - would it hurt to give some java stacktrace log parsing configs?
Also, their shitty Debian repo management resulted in a bug that caused my company to lose $30,000.
It's not Java, it's Ruby. Which is even worse, because distribution tarball
with logstash weighs ridiculous 71MB (?!?) and requires JVM to run (?!?) (or
at least nobody talks about it being runnable with MRI).
> Also, their shitty Debian repo management resulted in a bug that caused my
company to lose $30,000.
Well, this is not their fault that much. If you had put any thought into
using repositories, you wouldn't use random packages from random sources over
which you have no control and no trust with regard to package retention
policy or package quality.
Or maybe you would happily install also MongoDB from Mongo's site?
Elasticsearch is Java, and the logstash tarball includes it and kibana from memory, so you can run it as an all-in-one where logstash launches its own Elasticsearch.
While I don't have the tiniest font in my terminal, I still couldn't read the entire Elasticsearch process line in htop, even when I'd stretched the terminal all the way across three monitors! The middle one was an ultrawide! I really wish Java would stop using arguments instead of storing config somewhere...
> Elasticsearch is Java, and the logstash tarball includes it and kibana from memory,
Not really. It's just logstash, along with some plugins (and what the heck are
Maven bindings doing there?).
ElasticSearch is another 29MB compressed, which is fine for a database-like
thing, and Kibana 4.x takes 30MB compressed (150MB uncompressed, of which not
the biggest part is Node.js copy).
Got me on the ruby part. Meant to say JVM. My point was more to the fact that their site's incredibly short on examples.
FWIW, I inherited a thrice-removed infrastructure in a latency-sensitive environment. It was fun ;)
Logstash was installed on a couple machines the night before I was going to load some configs from my machine. I didn't load anything yet, except that logstash-web was running at 2s uptime. For some reason, (with upstart's help), the logstash installation included a broken webservice with a bad config. It got stuck in a JVM-birthing restart loop that was enough to cause significant impact. It happens.
(Funny enough, same place used mongodb. We ran into an issue once where database cleanups failed once mongodb reached around 50% of the disk. Local copy cleanups are fun.)
My favorite thing about logstash was when the XML parser would choke and logstash would just quietly stop processing events.. I wish it was engineered as well as riemann.
My favourite thing about logstash-forwarder was that it was apparently an actual intentional design decision to shut the service down if none of the log files it was watching moved in 24 hours. I'm still trying to figure that one out - a quiescent service... shuts itself down... by default... to what end? Ended up having to cronjob it to restart every day. If you wanted it not to do that, from memory you had to rebuild it from scratch.
I've never actually come across that in a service before, one that shuts itself down due to inactivity. I mean, they must be out there, but I never would have guessed it'd be intentionally designed into a log shipper.
Their renaming of ElasticSearch to Elastic really irked me. If only for the simple reason that searching for questions related to them on Google is harder now.
Probably. Trademarks are contextual. Two companies can both trademark the word "Apple". One in the context of computers, another in music distribution. So long as they don't get into each other's business, it's OK. Hmm... Yeah, Apple Records wasn't too happy with iTunes.
I maintain some Ansible roles, as well as some well-worn examples of ELK in production... And this marks the fourth time I'll have to basically rework _everything_ due to Elastic changing up the core architecture, naming, logos, etc.
I'm also less-than-thrilled with the performance for larger deployments. Having to babysit my log aggregation infrastructure (average about 50-100 events/sec across a few dozen hosts) is not very thrilling.
Have you had a chance to look at Fluentd? (Disclaimer: I'm a maintainer) Elasticsearch is by far the most popular use case for Fluentd (much kudos to both the community and Elastic), and it's been pretty stable for a while now.
I gave it a try today. Wanted to send process stats to graphite. Installed it on Ubuntu Trusty, installed the graphite plugin and the process watch plugin and... td-agent couldn't start: the process watch plugin was incompatible with the 0.12 version of fluentd. So I resorted to collectl.
No, I haven't yet, though I have heard the name here and there. I'm not going to be looking at logs again for a few months, but thanks for the pointer.
Then remember about Fluentd. I was using it for filling ElasticSearch with
logs when Kibana was still written in PHP by a separate developer and was
meant as a somewhat specialized frontend for logstash, not a generic one for
ElasticSearch. I'm using it still for shipping logs (again to ElasticSearch)
and metrics collected by monitoring, and I have even built a proof-of-concept
of a stream processing engine for monitoring out of it.
You could go really far with Fluentd using or writing only a few plugins.