Incorrect type inferred, creating invalid and inconsistent jsonb values #678
Okay, I did some additional digging and I am more confident that this is actually an issue with this library. With an initial value in my

I also tried updating the

I suspect that this is coming down to the wrong type being inferred by the library, resulting in an undesirable cast happening before the query is run, but I don't quite understand how to alter or predict that behavior. Part of what is driving this suspicion is that I am running in a serverless environment with semi-ephemeral connections, and these issues are happening in bursts—I'll get ~3 rows mangled in a 60-second period, and then not again for 20+ minutes.
You shouldn't use the `sql.json()` helper unless you were on pre-v3 of Postgres.js and know the reasons 😊 It is deprecated, and JSON is handled implicitly. Try removing the use of `sql.json()`.
@porsager I replaced it with
I am manually fixing invalid rows in a production database that are causing errors (a few new ones every hour). I've tried everything I can think of to make this work, and I'm at the point of considering migrating the codebase to Slonik, which is not an inconsiderable amount of work. I've tried looking through the code, but I don't have a great grasp of how the connection with Postgres works, which seems to be at the root of the issue. So if you have any concrete suggestions or explanations for what is happening, other than just "don't use this method" (which some of the affected queries don't use anyway), I'd appreciate it.
Just pass it raw
Sure - just adding what I can, considering I'm only on my mobile currently. If you're not able to make a reproducible case, it's hard to say where things might go wrong. Are you storing only objects/arrays, or also simple values like string, number, boolean directly? I use JSON a lot myself and in other projects, and so do many other users, and I've never heard of a case where the same data would sometimes be saved correctly and sometimes not. It's probably much more likely that you have input values that are sometimes already stringified and sometimes not. Postgres.js doesn't magically fix that for you, and neither does Slonik.
Cool, that works locally, but TypeScript doesn't like it:
In any case, I don't think that'll fix the issue, since it's happening with this query as well:

```ts
const DecodableInput = z.object({
  objectiveId: ObjectiveId,
  params: z.object({
    // some more fields; happy to share, just trying to be brief
  }),
  content: z.string().trim().min(1).max(2000),
  generated_by: z.string().uuid().optional(),
})

type DecodableInput = z.infer<typeof DecodableInput>

export const storeDecodable = async (input: DecodableInput) => {
  // None of the `input` is user-provided aside from `objectiveId`, which is
  // validated to be an integer in the specified range (9–109) by the API
  // route handler, again using Zod. `snakecaseKeys` comes from the package
  // named `snakecase-keys`.
  const data = snakecaseKeys(DecodableInput.parse(input), { deep: false })
  const rows = await sql<{ public_id: string }[]>`
    INSERT INTO decodables ${sql(data)}
    RETURNING public_id
  `
  if (rows.length !== 1) {
    throw new Error("Failed to insert decodable")
  }
  return rows[0].public_id
}
```

Can you tell me more about how this library handles type inference? I am wondering if it's plausible that the database connection (Neon via Vercel Storage) is getting into a bad state when a serverless function has been running for a while. I have a
I've had a couple instances where two columns on a single row that are saved back to back in the same HTTP request (but separate SQL queries) with disparate values (no relation to one another aside from them both being jsonb cols) were both stored stringified. The two queries are very different:
This seems pretty convincing to me—it's something downstream of a call to

Edit: I've also now been able to trigger this behavior with known, valid input, eliminating some kind of bungled browser input as a possibility.
For what it's worth, I switched these queries to Slonik as close to verbatim as possible, and the issue has gone away.
Hey @coreyward - sorry for the wait - deadlines at work 😅 I know you might've moved on, but I'd love to dig in if you can give me some leads or a reproducible case. I've looked for any clues, but haven't been able to come close to anything that should cause this, unless it could be related to #392. Crossing my fingers you might have a lead for me.
Hey, appreciate the follow-up! It seems plausible that #392 is related, but to be honest, it's a bit over my head. I never experienced the issue locally, just on Vercel, and we have been having networking issues with Vercel's infrastructure, so it's possible that this only arose as a side effect of complications with that environment. I never did figure out what specific circumstances triggered it, unfortunately. Wish I could do more to help narrow things down, but I don't know what else would be helpful. Do you have any thoughts?
I just found some cases where this happened on my local Postgres instance with the locally running app, so it seems the complications with networking are unrelated.
Interesting, we've been having similar issues lately. In our case it coincided with a pooler change, which we thought could be the origin of the issue, but the problem is exactly like you describe, and I believe one of us has been able to get a similar result with a different connection method, although it is pretty rare (it happens a lot with one, not so much with the other). Will report back if and when I find something that could get to a repro. If what I describe rings a bell, do let me know if there are areas I should poke around first.
Been poking a bit. Somehow calling describe() between every call makes the problem disappear. Things I tried as well:

```js
// Simplified example; entry data is an object, one of the fields is an object that maps to jsonb
for (const entryData of entries) {
  const insertedId = await sql`INSERT INTO x ${sql(entryData)} ON CONFLICT ... RETURNING id`;
  // Same query. Adding this makes the INSERT above work 100% of the time.
  await sql`/* same query */`.describe();
  if (insertedId) { /* run dependent queries */ }
}
```

This is the describe output:
And this is the .statement for a good and a bad insert. In both cases, the types are
Finally, it seems that setting the client

We don't manage the DB, so I have limited details on the configuration: it's using https://github.com/supabase/supavisor as a pooler, and the one that fails less is on PgBouncer, both on Supabase. I just realized it does indeed run in transaction mode, so maybe that's what @coreyward was experiencing as well. Docs say PgBouncer can be made to support it, but I don't know if theirs does. Supavisor just doesn't mention it.

Now, I don't know enough about Postgres connections to understand whether that means the issue is in the pooler and I should open an issue there, or whether it's on the query side. Thoughts?
That is some very nice debugging @plauclair! I think you are on to something with regard to the poolers, but normally PgBouncer will complain if you try to use named prepared statements when it is running in transaction mode. ( https://github.com/porsager/postgres#prepared-statements )
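Per the README section linked above, the way to run Postgres.js behind a transaction-mode pooler is to disable named prepared statements entirely. A minimal sketch (the connection string is a placeholder; adapt to your setup):

```javascript
import postgres from "postgres";

// With a transaction-mode pooler (PgBouncer/Supavisor), consecutive
// statements from one client connection may be routed to different
// server backends, so a named prepared statement created on one backend
// may not exist on the next. Disabling prepared statements avoids this.
const sql = postgres(process.env.DATABASE_URL, {
  prepare: false,
});
```

This trades some per-query performance for correctness behind poolers that cannot guarantee backend affinity.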
That's what I thought as well, so this is a bit surprising. I'm wondering if they might be using a custom fork of PgBouncer that introduces unexpected behavior. Right now, I don't have a reason to suspect your library to be the cause of this issue. 👍
If there is any news in this case, feel free to reopen.
I am experiencing a few bugs related to the format of a value inserted into a JSONB column varying. The results are non-deterministic—I can run the same query with the same values back to back, and it only occasionally winds up formatted differently.

In one bug, the value (a nested JS object) is stringified before insertion into the column as an escaped string (e.g., `"{\"foo\": \"bar\"}"`). The query for this is pretty simple: `INSERT INTO my_table ${sql(data)} RETURNING id`. This is happening about 3% of the time.

In another, I'm doing a merge with an existing object value and it's winding up an array somehow. This is happening less often—about 0.1% of the time. The SQL query for this is again relatively simple, leaning on `sql` to do the heavy lifting:

Since I don't know where this is coming from and it's so infrequent, it's hard to put together a reproduction, but I am using Zod to parse and validate all inputs before they're passed to `sql`, so I am reasonably sure that it is not merely the input being in the wrong format (e.g., no `typeof value === "object"` check letting arrays slip through).

If there are any things I can do to capture additional information, or avoid this even without fully knowing what is happening, I am all ears. 🤞 🙏