But if you want to use the Google SDK (python-genai, js-genai) rather than the OpenAI SDK (I found the Google API more feature-rich when working with other modalities like audio/images/video), you cannot use OpenRouter. Also, if you're developing an app and need higher rate limits: what's the typical rate limit via OpenRouter?
Also, for some reason, when I tested a simple prompt (a few words, no system prompt) with one image attached, OpenRouter charged me ~1,700 tokens, while going directly via python-genai it's ~400 tokens. Keep in mind they also charge a small markup fee when you top up your account.
You can do this with LLM proxies like LiteLLM. e.g. Cursor -> LiteLLM -> LLM provider API.
I have LiteLLM server running locally with Langfuse to view traces. You configure LiteLLM to connect directly to providers' APIs. This has the added benefit of being able to create LiteLLM API keys per project that proxies to different sets of provider API keys to monitor or cap billing usage.
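For reference, the setup above boils down to a small LiteLLM proxy config. This is a minimal sketch, assuming an OpenAI upstream and the Langfuse callback; the model names and env-var references are illustrative, and you'd add one entry per provider you route to:

```yaml
# config.yaml for the LiteLLM proxy (sketch, not a complete config)
model_list:
  - model_name: gpt-4o                  # name clients request
    litellm_params:
      model: openai/gpt-4o              # actual provider/model
      api_key: os.environ/OPENAI_API_KEY

litellm_settings:
  success_callback: ["langfuse"]        # ship traces to Langfuse
```

Per-project virtual keys (for the billing caps mentioned above) are then created against the proxy itself, so client apps never see the real provider keys.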
You need LLM Ops. YC happens to have invested in Langfuse; if you're serious about tracking metrics, you'll appreciate the rest of what it offers, too.
And before you ask: yes, you can accommodate both cached-content and batch-completion discounts; it just needs a bit of logic in your completion-layer code.
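That completion-layer logic can be very thin. Here's a hypothetical sketch of a cost estimator that accounts for both discounts; the discount factors and per-token rates are made-up placeholders, not any provider's real pricing:

```python
# Hypothetical completion-layer cost logic (sketch).
# All rates/discounts below are illustrative assumptions, not real pricing.

BATCH_DISCOUNT = 0.5           # assumed: batch endpoint bills at half price
CACHED_INPUT_DISCOUNT = 0.25   # assumed: cached input tokens bill at 25%

def estimate_cost(input_tokens: int, output_tokens: int, *,
                  rate_in: float = 1.0, rate_out: float = 3.0,
                  cached_input_tokens: int = 0,
                  batched: bool = False) -> float:
    """Estimated cost in (placeholder) dollars per million tokens."""
    fresh_in = input_tokens - cached_input_tokens
    cost = (fresh_in * rate_in
            + cached_input_tokens * rate_in * CACHED_INPUT_DISCOUNT
            + output_tokens * rate_out) / 1_000_000
    if batched:
        cost *= BATCH_DISCOUNT  # batch jobs trade latency for price
    return cost
```

A wrapper like this sits between your app and the provider client, so deciding "send this as a batch job" or "reuse the cached prefix" is just a flag at the call site.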