Bill Gates described it as his biggest product regret (2).
I remember thinking it was brilliant. Too bad it was probably a bit too futuristic for its time, like a few other things they launched when the time just wasn't right... the clunky Tablet PCs (3) were for sure another example.
To clarify the comment re: Bill Gates's biggest regret: according to the referenced article, his biggest regret was that Microsoft never shipped WinFS. He did not regret the product itself. It was unclear to me what the parent meant, as my first question was, "If he regretted it, did he allude to reasons why it's a bad idea?" But that question no longer makes sense once you realize that Mr. Gates ostensibly still believes in the idea.
I remember reading this from Bill Gates on his reddit AMA:
> We had a rich database as the client/cloud store that was part of a Windows release that was before its time. This is an idea that will remerge since your cloud store will be rich with schema rather than just a bunch of files and the client will be a partial replica of it with rich schema understanding.
He then confirms a few comments later that he was talking about WinFS.
While WinFS was a good start, I think the idea of a semantic file system could be extended much further to the whole system (if it weren't for pesky POSIX)
I think most people would expect
/home/geokon/program1/src/
and
/src/program1/home/geokon
to have pretty much the same content
A tag-based file system that makes the two equivalent would eliminate all sorts of annoyances where you can't decide how to structure your file hierarchy
(should you have bin/program1 bin/program2 src/program1 src/program2 or program1/src program1/bin program2/src program2/bin? both layouts have their advantages).
Something like a "path/path/bin/path/path/bin" wouldn't work.. but it's hard to find a case where you really need it. And the vast majority of time the subfolders aren't strict subsets of the parent (like mammals/dogs mammals/cats mammals/whales - where dogs/mammals would be a little weird)
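To make the equivalence concrete, here is a minimal sketch, assuming every path component is simply treated as an unordered tag. The helper name `path_tags` is made up for illustration and is not part of any real file system API.

```python
def path_tags(path):
    """Treat every non-empty path component as an unordered tag."""
    return frozenset(part for part in path.split("/") if part)


a = path_tags("/home/geokon/program1/src/")
b = path_tags("/src/program1/home/geokon")
print(a == b)  # both paths name the same set of tags

# Repeated components collapse into one tag, which is why something like
# "path/path/bin/path/path/bin" cannot be represented in this model.
repeated = path_tags("path/path/bin/path/path/bin")
print(sorted(repeated))
```

Using a set is the whole design choice here: ordering and repetition are thrown away on purpose, which gives the layout freedom described above at the cost of the repeated-component case.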
Years ago, I had a similar idea, but never did anything with it. Look at DNs in LDAP (and X.400/X.500), they are based on attribute=value pairs. What about a filesystem in which filenames were collections of attribute=value pairs?
e.g. /home/geokon/program1/src/foo.c
could become:
user=geokon/program=program1/category=src/lang=c/name=foo
You could potentially decide that the order of the attributes is not significant, only the set of attribute-value pairs.
Downside: Too much typing. Although, maybe you could allow standard aliases for attribute names, so that:
(I think the attributes should be first-class filesystem objects, just like files are, as opposed to just text strings. Some of the values, e.g. an enumerated value like lang=c, should be first-class objects as well.)
Just write geokon program1 src c foo, and then just display a list of files that match.
You could for example also make fake folder/menus to navigate tags. (Which would just be appending filters.)
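A rough sketch of that kind of lookup, assuming a hypothetical in-memory index that maps each file to its set of tags (`INDEX`, `search` and `narrow` are illustrative names, not an existing tool's API):

```python
# Hypothetical index: file name -> set of tags attached to it.
INDEX = {
    "foo.c": {"geokon", "program1", "src", "c", "foo"},
    "bar.c": {"geokon", "program1", "src", "c", "bar"},
    "foo": {"geokon", "program1", "bin", "foo"},
}


def search(*tags):
    """Return the files whose tag set contains every requested tag."""
    wanted = set(tags)
    return sorted(name for name, have in INDEX.items() if wanted <= have)


def narrow(results, tag):
    """Entering a fake 'folder' just appends one more filter."""
    return [name for name in results if tag in INDEX[name]]


exact = search("geokon", "program1", "src", "c", "foo")
broad = search("geokon", "program1")
in_src = narrow(broad, "src")
print(exact, broad, in_src)
```

A real implementation would need a persistent index rather than a dict, but the navigation model stays the same: each step only shrinks the result set.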
There's a project that kinda wants to do that, ie provide a virtual interface on top of the existing filesystem based on tags you give each file: https://tmsu.org/
BTRFS has had a bunch of problems trying to actually compete with traditional filesystems, though. In the distributed world, CalvinFS seems pretty promising to me.
In some ways, Microsoft is on the way there with the way PowerShell works, and the ability to script things through OS functions that return objects which can be queried. If we ever see a WinFS, it would be very powerful combined with PowerShell
I've been using this for years and it is LIGHTNING fast.
No need to "index" all the files because it reads directly from the MFT. If you create a new file matching the search pattern it's already sitting in the results window by the time you alt-tab.
Also, the "directory size" equivalent of Everything is WizTree [1] ... much faster than WinDirStat, which I see recommended way too often.
Are you sure it doesn't index anything? There is even a section in the settings called Indexes.
Also when you launch it first, it's going to be empty and says it's scanning your folders and it takes a little bit until you see something.
I think it is still indexing (maybe using the MFT instead of recursively listing files and directories), it's just a lot better than Windows search indexing. And it might use this [1] to keep up to date? It's mentioned in the settings
No I'm not sure. Maybe it builds a rudimentary index... but give it a shot yourself and see. 15 seconds after installing, you can search your entire system instantly. It's crazy whatever it is doing.
And yes I do believe it uses the USN Journal to stay up to date.
Thanks for WizTree, you're right that it's an order of magnitude faster than WinDirStat. Only thing it's missing is that graphical block view, but for my usecase it will be perfect.
Everything is pretty awesome, one of the things I really miss on Linux.
When I last looked for a Linux replacement for it, none had the real time updates or the instant search ui, and even those that claimed to index the file system for a quick search were very slow. I actually ended up writing my own hacky and rudimentary GUI over locate to achieve something that fits my needs (and of course doesn't support real time updates).
Maybe things changed since then? Any chance that the HN crowd knows of a good Linux replacement for Everything?
I don't know their reasons for WinFS, but I do know that Everything's authors figured out a long time ago how to get great search with Windows' current FS.
Everything does not index the content of your files, only the name and some attributes.
Also, MS already indexes file contents by default; it just sucks at it. There have been several occasions where I used the find-file-by-name syntax and Windows couldn't even find the file in the current folder.
This is pretty amazing. I've been using a Windows program called Agent Ransack [0] for finding files with regex, but this Everything program is so much faster. Incredibly so. Thanks for the tip!
Everything is one of the first tools I install on every windows box. For me personally it is a must have. I don't think I know of any other tool which altered my workflow that heavily.
I couldn't find anything on the FAQ and I remember a similar tool posted here on HN and people complained that it called home for some search functionality.
Thanks for sharing! I've installed this locally and I'm really impressed by how easy to use and powerful this is! I must have missed the previous mentions on HN [0],[1].
yeah; wmi is incredibly powerful. before osquery linux and os x had nothing like it. it even has performance counters (albeit at slower intervals than win etl) at the ready.
On macOS, there is a query syntax [0] that's usable in Spotlight and the mdfind(1) command. Richer searchable attributes [1], but the results may have to be piped through other tools for formatting or other output.
I think SQL is too verbose for use on the terminal. find + grep does the trick with way less verbose syntax (but also probably less readable). With that said, it is quite cool.
I wanted to write the exact opposite: a Mysql/Postgres client as a FUSE filesystem driver. Namespaces -> folders, tables -> (editable) CSV files, stored procedures and settings accessible as (editable) plain text files.
If someone put data in a column that wasn't valid, like a string in a bigint column, would the table be altered or would the FUSE driver refuse to make the change?
Even this can be made safe(r) if you only connect to your database through a proxy that sanitizes queries. IIRC vitess adds an implicit LIMIT 10 to queries that don't have a limit.
Seems if the query is always going to start with SELECT, that maybe it should be assumed?
I would never use this though, ack or find seem sufficient to me.
It's a shame Bash used 'select' as an elaborate menu built-in - it'd be quite neat to name the binary that (and drop the quotes). Then you could just type the query right into your prompt!
Just use the fish shell instead, and you can avoid the years of shell cruft of bash, or the endless customization of zsh. Bash is a good environment for shell scripting, but not really the best for user interaction. Although perl is probably the best environment for shell scripting.
Or just ... Why even bother naming fields. Just make <space> enter return everything from anywhere from all time for all people on all platforms.
How many times are we going to have this ridiculous suggestion that fewer characters/words is automatically better?
This is 'short/arrow functions' (pick your language) all over again, and invariably ends up with the situation where the new syntax is just fucking impossible to read at first glance, because it has so many variances.
Parens are optional. Unless you have no arguments, then they're required. Curly braces are optional, unless you want more than a simple expression, or no return or to return an object literal, then they're required.
I grow weary enough of this bullshit notion that code must 'look pretty', but when you're using "less characters is always best" as the definition for 'pretty' it just becomes unbearable.
It could be extended with other queries like “update” for batch file operations, I suppose. Like you, I’m fine with my existing suite of find/grep/ack/awk/perl/whatever, but I already know how to use them; a beginner or someone who doesn’t live their whole life at a Unix terminal could probably benefit from the simpler interface.
Sure; in general, a file system endowed with extended/extensible attributes can be naturally seen as a relational database (in which the files themselves are BLOBs).
find/grep/awk/ag get me a long way to be honest. However, I think this is a cool project because it makes filtering of file attributes (such as size) so much easier. No need for splitting strings and using regex. Cheers.
Not to take anything away from this project, but you can filter on size, permission, etc easily and robustly using just `find` (try -perm, -size, -{c,m}time, etc flags): https://linux.die.net/man/1/find
If you are splitting strings (from output of `ls -l` presumably) for such tasks, then definitely take a look at find.
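For anyone who would rather script this than pipe it, the same kind of attribute filtering can be sketched in Python with `os.walk` and `os.lstat`. This is only an illustration in the spirit of find's -size and -perm tests; `find_files` and its parameters are hypothetical and do not mirror find's exact semantics.

```python
import os
import stat
import tempfile


def find_files(root, min_size=0, perm_mask=0):
    """Yield regular files under root that are at least min_size bytes
    and have every permission bit in perm_mask set."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            if not stat.S_ISREG(st.st_mode):
                continue
            if st.st_size >= min_size and (st.st_mode & perm_mask) == perm_mask:
                yield path


# Small demo in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "big.bin"), "wb") as f:
        f.write(b"\0" * 2048)
    with open(os.path.join(root, "small.txt"), "wb") as f:
        f.write(b"hi")
    matches = [os.path.basename(p) for p in find_files(root, 1024, stat.S_IRUSR)]

print(matches)
```

No string splitting is involved: sizes and modes come straight from the stat structure, which is the same reason find is more robust than parsing `ls -l`.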
Nice project, wish you the best ! Although tbh, I personally won't use this simply because I know enough of find(1) to not see the cognitive overhead of switching to sql to do filesystem /queries/.
Any examples where this would be better than using find (with the occasional filter thrown in) ?
"select name from foo where name not in (select name from ../bar where date < ...)"
I'm usually fine with `find`, but doing anything more interesting than "find files in this directory that are not in that directory", while uncommon, tends to make me think about my pipeline a bit.
Out of curiosity, I'm interested in how folks would do the "in this not that" folder query. At a gut shot, I'd assume that diff would be used. I'm about to dig through the find man page to see if it has something directly to help.
One tool for this is the 'comm' utility: given two files containing sorted lines, it can output one or more of (1) lines only in file 1, (2) lines only in file 2, and (3) lines common to both files.
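The same three-way split can be sketched with set operations in Python. This is an analogue of what comm reports, applied directly to two directory listings, not a reimplementation of the utility; the function name `comm` is reused here only for familiarity.

```python
import os
import tempfile


def comm(dir_a, dir_b):
    """Return (only in dir_a, only in dir_b, in both) as sorted name lists."""
    a, b = set(os.listdir(dir_a)), set(os.listdir(dir_b))
    return sorted(a - b), sorted(b - a), sorted(a & b)


# Small demo with two throwaway directories.
with tempfile.TemporaryDirectory() as foo, tempfile.TemporaryDirectory() as bar:
    for directory, names in ((foo, ["x", "y"]), (bar, ["y", "z"])):
        for name in names:
            open(os.path.join(directory, name), "w").close()
    only_foo, only_bar, both = comm(foo, bar)

print(only_foo, only_bar, both)
```

The first element of the result answers the "in this, not that" question; unlike comm, nothing has to be sorted beforehand because sets do the matching.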
It sure did. Alas at the time I had a BeBox in college (mid-late 90s) I didn't know SQL yet ;) I think it just searched over file metadata, not contents, though I might be mistaken.
Unfortunately, it was commercial development, so it's not released. But implementation is relatively easy - it was a sqlite virtual table that (as much as I remember) looked in where condition for dir field, and listed that directory (= returned stat() data).
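Since that code was never released, the following is only a guess at the general shape, and it swaps in a simpler technique: Python's standard `sqlite3` module cannot define virtual tables, so this sketch copies stat() data for one directory into an ordinary table and queries that. The table name `files`, its columns, and `load_dir` are all assumptions made for illustration.

```python
import os
import sqlite3
import tempfile


def load_dir(conn, directory):
    """Copy stat() data for one directory into a plain table named files.
    A real virtual table would produce these rows lazily instead."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files "
        "(dir TEXT, name TEXT, size INTEGER, mtime REAL)"
    )
    with os.scandir(directory) as entries:
        rows = [
            (directory, e.name, e.stat().st_size, e.stat().st_mtime)
            for e in entries
            if e.is_file()
        ]
    conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", rows)


# Small demo in a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "a.log"), "wb") as f:
        f.write(b"12345")
    with open(os.path.join(d, "b.txt"), "wb") as f:
        f.write(b"x" * 100)
    conn = sqlite3.connect(":memory:")
    load_dir(conn, d)
    big = conn.execute(
        "SELECT name FROM files WHERE dir = ? AND size > 50 ORDER BY name", (d,)
    ).fetchall()
    conn.close()

print(big)
```

The eager copy is fine for a single directory; the virtual-table approach described above avoids it by reading the `dir` condition from the query and listing only what was asked for.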
The whole thing was quite interesting, because almost every component was somehow hooked into sqlite (either with a function or a virtual table), so one could do pretty interesting things with SQL alone.
Nice! I'm actually working on a similar project to push lsof and files from /proc into some postgres tables. Lets me do cool things like query log files across a ~6000 server infrastructure similar to:
SELECT distinct(l.name)
FROM lsof l, lsofer_runs r
WHERE l.lsofer_id = r.id
AND fd_type = 'REG'
AND l.fd ~ '[0-9][uw]'
AND l.name like '%log'
GROUP BY l.name, r.hostname
ORDER BY name
I'm specifically writing it to find any log file that isn't being pushed into our third party logging service. It's a surprisingly difficult problem, especially considering the amount of tech sprawl that's accumulated. Since it's also a relatively low latency environment, it has to be written in a way that doesn't add too much load (without core isolation..).
Definitely crossed my mind, but I'm working on hosts where installing auditd isn't really easy. Broken yum and apt all over the place makes installing new packages almost impossible. Same goes for lsof, but its installed in "enough" places. Kinda nightmarish, but it gives me a chance to write some fun code ;).
Also, thanks for the article! Super interesting. Think that'd be better than implementing something on top of sysdig?
I wasn't aware of the polling limitations of sysdig, but it definitely explains some things I've seen in the past. This is definitely going in my toolkit. Cheers!
Is this something you could do with Presto? You'd need to write a custom connector and it doesn't look like there is support for dynamically adding/removing catalogs (https://github.com/prestodb/presto/issues/2445) but it would presumably handle the heavy lifting for you.
This is nice, but what I'm actually looking for is a lightweight clone of SharePoint Search[1].
Something that has a self-hosted web interface, and an engine that I can point at some file servers and let index the files to its heart's content. All I then have to do is search the index 'google style' for my files.
This is really neat. macOS has an easy-to-use smart folder feature which I use to find recent files and large files. An interface like this is an advantage because it's easy to understand what it's doing and it's cross-platform. Other people claim it may be verbose, but being verbose makes the operation clearer, and SQL is familiar to programmers who are power users.
I never know when I have to use find vs grep. And linux grep is different from macOS grep so I google about it every day lol. I just never figure it out.
I think I'll be a heavy user of FSQL.
(1) https://en.m.wikipedia.org/wiki/WinFS
(2) http://www.zdnet.com/article/bill-gates-biggest-microsoft-pr...
(3) https://en.m.wikipedia.org/wiki/Microsoft_Tablet_PC