Show HN: Latex-wc – Word count and word frequency for LaTeX projects (github.com/sethbarrett50)
10 points by sethbarrettAU 1 day ago | 7 comments
I was revising my proposal defense and kept feeling like I was repeating the same term. In a typical LaTeX project split across many .tex files, it’s awkward to get a quick, clean word-frequency view without gluing everything together or counting LaTeX commands/math as “words”.

So I built latex-wc, a small Python CLI that:

- extracts tokens from LaTeX while ignoring common LaTeX “noise” (commands, comments, math, refs/cites, etc.)

- can take a single .tex file or a directory and recursively scan all *.tex files

- prints a combined report once (total words, unique words, top-N frequencies)

Fastest way to try it is `uvx latex-wc [path]` (file or directory). Feedback welcome, especially on edge cases where you think the heuristic filters are too aggressive or not aggressive enough.
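To make the idea concrete, here is a rough sketch of the kind of filtering and counting described above. It is not the actual latex-wc code: the regexes and the function names `extract_words` and `count_words` are assumptions chosen for illustration, and real LaTeX has many cases (escaped `\$`, `\begin{equation}` blocks, verbatim environments) that this does not handle.

```python
import re
from collections import Counter
from pathlib import Path

# Hypothetical filters for illustration only; not latex-wc's actual rules.
COMMENT = re.compile(r"(?<!\\)%.*")                    # unescaped % to end of line
MATH = re.compile(r"\$\$.*?\$\$|\$.*?\$", re.DOTALL)   # $$...$$ and $...$ only
REF = re.compile(r"\\(?:ref|eqref|cite\w*|label)\*?(?:\[[^\]]*\])*\{[^}]*\}")
COMMAND = re.compile(r"\\[A-Za-z]+\*?")                # remaining command names
WORD = re.compile(r"[A-Za-z][A-Za-z'-]*")


def extract_words(tex: str) -> list[str]:
    """Strip comments, math, refs/cites, and command names, then tokenize."""
    for pattern in (COMMENT, MATH, REF, COMMAND):
        tex = pattern.sub(" ", tex)
    return [w.lower() for w in WORD.findall(tex)]


def count_words(path: Path) -> Counter:
    """Count words in one .tex file, or in every *.tex file under a directory."""
    files = [path] if path.is_file() else sorted(path.rglob("*.tex"))
    counts = Counter()
    for f in files:
        counts.update(extract_words(f.read_text(encoding="utf-8", errors="ignore")))
    return counts
```

With this sketch, `count_words(Path("thesis/")).most_common(20)` would give a combined top-20 list, and `sum(counts.values())` and `len(counts)` give the total and unique word counts.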





Are you aware of the "texcount" program [0] that's distributed with TeX Live by default?

[0]: https://ctan.org/pkg/texcount?lang=en


  detex "$@" | wc
  detex "$@" | tr -cs '[:alnum:]' '\n' | grep . | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn
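For anyone who would rather post-process the `detex` output in Python, the second pipeline is roughly equivalent to the sketch below. The function name `word_frequencies` is made up for illustration, and it assumes ASCII alphanumerics, whereas `tr`'s `[:alnum:]` depends on the locale. The order among words with equal counts may also differ from `sort -rn`.

```python
import re
from collections import Counter


def word_frequencies(text: str) -> list[tuple[str, int]]:
    """Lowercase the text, split on non-alphanumerics, and rank by count."""
    words = re.findall(r"[A-Za-z0-9]+", text.lower())
    return Counter(words).most_common()
```

Feeding it `sys.stdin.read()` and printing each `(word, count)` pair would reproduce the frequency listing.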

We need a link!


Added above. Thanks!

I think a link to the source code repository would be better:

https://github.com/sethbarrett50/LaTeX-wc


Changed to that. Thanks!


