Convert PostgreSQL bytea-stored serialized-java-UUID to postgresql-UUID


One of our software projects uses a PostgreSQL table with a column 'guid' of type bytea.

The column is written by Hibernate 3.3.2.GA against PostgreSQL 8.4. Hibernate serializes the Java UUID type using Java object serialization, so the stored value looks like the following escape-format bytea literal:

'\254\355\000\005sr\000\016java.util.UUID\274\231\003\367\230m\205/\002\000\002J\000\014leastSigBitsJ\000\013mostSigBitsxp\273\222)\360*r\322\262u\274\310\020\342\004M '

... which we cannot easily use in a query, either in the select list or in a condition, to retrieve the relevant rows.

Does anyone have a way to read or use the bytea-column in the select- or where-parts of a query (e.g. via psql or pgadmin3), without setting up some hibernate-query?
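To make the stored value easier to inspect, here is a minimal Python sketch that decodes the escape-format literal shown above into raw bytes. The function name decode_bytea_escape is hypothetical, and the sketch assumes the text uses only \ooo octal escapes, doubled backslashes, and printable ASCII, which is what PostgreSQL's escape output format produces.

```python
def decode_bytea_escape(text: str) -> bytes:
    """Decode PostgreSQL escape-format bytea text into raw bytes (sketch)."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] != "\\":
            out.append(ord(text[i]))               # printable ASCII is stored as-is
            i += 1
        elif text[i + 1] == "\\":
            out.append(0x5C)                       # doubled backslash is one backslash
            i += 2
        else:
            out.append(int(text[i + 1:i + 4], 8))  # \ooo is one byte in octal
            i += 4
    return bytes(out)


# The literal from the question (the trailing space is a real byte, 0x20).
SAMPLE = (
    r"\254\355\000\005sr\000\016java.util.UUID\274\231\003\367\230m\205/"
    r"\002\000\002J\000\014leastSigBitsJ\000\013mostSigBitsxp"
    r"\273\222)\360*r\322\262u\274\310\020\342\004M "
)

raw = decode_bytea_escape(SAMPLE)
print(len(raw), raw[:4].hex())  # 80 aced0005
```

The value is 80 bytes long and starts with the Java serialization stream magic (AC ED 00 05), so it is a serialized java.util.UUID object rather than the 16 raw UUID bytes.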

3 answers

#1


6  

Update: See the edit to the question. This answer applies to the commonplace 16-byte serializations of uuid; the question was later amended to say the values are Java-serialized.


Interesting problem. I ended up writing a simple C extension to do it efficiently, but it's probably more sensible to use the PL/Python version below.

Because uuid is a fixed-size type and bytea is varlena, you can't just CREATE CAST ... AS IMPLICIT to binary-coerce them: the variable-length field header would get in the way.

There's no built-in function for bytea input to return a uuid. It'd be a handy thing to have, but I don't think anyone's done it yet.

Simplest way

Update: There's actually a simple way to do this. bytea in hex form is actually a valid uuid literal once the \x is stripped off, because uuid_in accepts plain undecorated hex without - or {}. So just:

regress=> SET bytea_output = 'hex';
SET
regress=> SELECT CAST( substring(CAST (BYTEA '\x0FCC6350118D11E4A5597DE5338EB025' AS text) from 3) AS uuid);
              substring               
--------------------------------------
 0fcc6350-118d-11e4-a559-7de5338eb025
(1 row)

It involves a couple of string copies and a hex encode/decode cycle, but it'll be tons faster than any of the PL answers I suggested earlier, though slower than C.
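For comparison, the same hex round trip can be sketched on the client side in Python. This is only an illustration of the idea, assuming the bytea holds the 16 raw UUID bytes (not a Java-serialized object); the variable names are made up for the example.

```python
import uuid

# The 16 raw bytes as they would be stored in the bytea column.
raw = bytes.fromhex("0FCC6350118D11E4A5597DE5338EB025")

# Hex-encode the bytes, then parse the undecorated hex as a UUID,
# mirroring the bytea -> hex text -> uuid cast in the SQL above.
value = uuid.UUID(hex=raw.hex())
print(value)  # 0fcc6350-118d-11e4-a559-7de5338eb025
```

Like uuid_in in PostgreSQL, Python's uuid.UUID accepts 32 hex digits without hyphens or braces.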

Other options

Personally I recommend using PL/Perl (plperlu) or PL/Python (plpythonu). Examples of both follow.

Assuming your uuid is the hex-format bytea literal:

'\x0FCC6350118D11E4A5597DE5338EB025'

you could turn it into a uuid type with:

PL/Perl

create language plperlu;

create or replace function to_uuid(bytea) returns uuid language plperlu immutable as $$
use Data::UUID;
my $ug = new Data::UUID;
my $uuid = $ug->from_hexstring(substr($_[0],2));
return $ug->to_string($uuid);
$$;
SET bytea_output = hex;

SELECT to_uuid(BYTEA '\x0FCC6350118D11E4A5597DE5338EB025');

PL/Python

It's probably faster and cleaner in Python because the PL/Python interface passes bytea as raw bytes not as hex strings:

CREATE LANGUAGE plpythonu;

CREATE or replace function to_uuid(uuidbytes bytea) 
RETURNS uuid LANGUAGE plpythonu IMMUTABLE 
AS $$
import uuid
return uuid.UUID(bytes=uuidbytes)
$$;

SELECT to_uuid(BYTEA '\x0FCC6350118D11E4A5597DE5338EB025');

In C, just for kicks. Ugly hack.

You can see the C extension module here.

But really, I mean it about it being ugly. If you want it done properly in C, it's best to actually patch PostgreSQL rather than use an extension.

#2


0  

After some trial and error I have created the following function to extract the postgresql-UUID value:

CREATE OR REPLACE FUNCTION bytea2uuid (x bytea) RETURNS uuid as $$ SELECT encode(substring(x, 73, 8) || substring(x, 65, 8), 'hex')::uuid $$ language sql;

This works by extracting the bytes of the two Java long values leastSigBits and mostSigBits (the serialized stream stores leastSigBits first, so the two halves have to be swapped), then encoding them to hex and casting to type 'uuid'.

Used as follows: select bytea2uuid(guid) as guid from documents limit 1;

"75bcc810-e204-4d20-bb92-29f02a72d2b2"
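The offsets 65 and 73 used by bytea2uuid can be cross-checked outside the database. The following Python sketch rebuilds the 64-byte header seen in the question's sample and applies the same swap; the names HEADER and java_uuid_to_uuid are hypothetical, and the layout is inferred from that one sample, so other Java or Hibernate versions may serialize differently.

```python
import uuid

# Header of the serialized java.util.UUID, as seen in the question's sample.
HEADER = (
    b"\xac\xed\x00\x05"                  # stream magic and version
    b"sr\x00\x0ejava.util.UUID"          # object and class descriptor tags, class name
    b"\xbc\x99\x03\xf7\x98\x6d\x85\x2f"  # serialVersionUID
    b"\x02\x00\x02"                      # flags, number of fields
    b"J\x00\x0cleastSigBits"             # long field, written first
    b"J\x00\x0bmostSigBits"              # long field, written second
    b"xp"                                # end of class data, no superclass
)


def java_uuid_to_uuid(blob: bytes) -> uuid.UUID:
    """Extract the UUID from a Java-serialized java.util.UUID (sketch)."""
    if len(blob) != 80 or blob[:64] != HEADER:
        raise ValueError("not the expected serialized java.util.UUID layout")
    least = blob[64:72]  # SQL substring(x, 65, 8)
    most = blob[72:80]   # SQL substring(x, 73, 8)
    return uuid.UUID(bytes=most + least)


sample = HEADER + bytes.fromhex("bb9229f02a72d2b2") + bytes.fromhex("75bcc810e2044d20")
print(java_uuid_to_uuid(sample))  # 75bcc810-e204-4d20-bb92-29f02a72d2b2
```

Checking the header before slicing guards against rows that hold a different serialization, which the fixed-offset SQL version would silently convert into a wrong uuid or fail to cast.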

#3


0  

This works for me:

ALTER TABLE myTable ALTER COLUMN id TYPE uuid USING CAST(ENCODE(id, 'hex') AS uuid);