scylla_cql/deserialize/mod.rs
//! Framework for deserialization of data returned by database queries.
//!
//! Deserialization is based on two traits:
//!
//! - A type that implements `DeserializeValue<'frame, 'metadata>` can be deserialized
//!   from a single _CQL value_ - i.e. an element of a row in the query result,
//! - A type that implements `DeserializeRow<'frame, 'metadata>` can be deserialized
//!   from a single _row_ of a query result.
//!
//! Those traits are quite similar to each other, both in the idea behind them
//! and the interface that they expose.
//!
//! It's important to understand what a _deserialized type_ is. It's not just
//! any implementor of `Deserialize{Value, Row}`; some implementors of
//! `Deserialize{Value, Row}` are not yet final types, but **partially**
//! deserialized types that support further deserialization - _type
//! deserializers_, such as `ListlikeIterator`, `UdtIterator` or `ColumnIterator`.
//!
//! # Lifetime parameters
//!
//! - `'frame` is the lifetime of the frame. Any deserialized type that is going to borrow
//!   from the frame must have its lifetime bound by `'frame`.
//! - `'metadata` is the lifetime of the result metadata. Result metadata is only needed
//!   during the deserialization process itself, and the **final** deserialized types (i.e.
//!   those that are not going to deserialize anything else, as opposed to e.g. `MapIterator`)
//!   can later live independently of the metadata, so this lifetime is kept separate from
//!   `'frame`.
//!
//! _Type deserializers_, as they still need to deserialize some type, are naturally bound
//! by the `'metadata` lifetime. However, final types are completely deserialized, so they
//! should not be bound by `'metadata` - only by `'frame`.
//!
//! Rationale:
//! `DeserializeValue` requires two kinds of data in order to perform
//! deserialization:
//! 1) a reference to the CQL frame (a `FrameSlice`),
//! 2) the type of the column being deserialized, which is part of the
//!    `ResultMetadata`.
//!
//! Similarly, `DeserializeRow` requires two kinds of data in order to
//! perform deserialization:
//! 1) a reference to the CQL frame (a `FrameSlice`),
//! 2) a slice of specifications of all columns in the row, which is part of
//!    the `ResultMetadata`.
//!
//! When deserializing owned types, both the frame and the metadata can have
//! any lifetime; it does not matter. Borrowed types, however, borrow from
//! the frame, so their lifetime must necessarily be bound by the lifetime
//! of the frame. Metadata is only needed during deserialization, so its
//! lifetime does not conceptually bound the deserialized value. To avoid
//! unnecessarily shortening the deserialized values' lifetime to the
//! metadata's lifetime (which would happen if the metadata's and the frame's
//! lifetimes were unified in value deserializers), a separate lifetime
//! parameter is introduced for result metadata: `'metadata`.
//!
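//! The rationale above can be illustrated with a minimal, std-only sketch. `ColType` and
//! `Deser` below are hypothetical stand-ins invented for this example (they are not the
//! driver's `ColumnType` or `DeserializeValue`); the point is only that, with two separate
//! lifetime parameters, a value borrowed from the frame may outlive the metadata:
//!
//! ```rust
//! // Hypothetical stand-in for column type metadata (not the driver's `ColumnType`).
//! enum ColType {
//!     Blob,
//!     Text,
//! }
//!
//! // Hypothetical, simplified analogue of a value-deserialization trait with
//! // separate lifetimes for the frame and for the metadata.
//! trait Deser<'frame, 'metadata>: Sized {
//!     fn deserialize(typ: &'metadata ColType, raw: &'frame [u8]) -> Option<Self>;
//! }
//!
//! // A borrowing type: the result is tied to `'frame` only, not to `'metadata`.
//! impl<'frame, 'metadata> Deser<'frame, 'metadata> for &'frame str {
//!     fn deserialize(typ: &'metadata ColType, raw: &'frame [u8]) -> Option<Self> {
//!         match typ {
//!             ColType::Text => std::str::from_utf8(raw).ok(),
//!             ColType::Blob => None,
//!         }
//!     }
//! }
//!
//! fn deserialize_with_short_lived_metadata<'frame>(frame: &'frame [u8]) -> Option<&'frame str> {
//!     // The metadata lives only inside this function...
//!     let metadata = ColType::Text;
//!     // ...but the result borrows only from the frame, so it can be returned.
//!     <&str as Deser>::deserialize(&metadata, frame)
//! }
//! ```
//!
//! Under these assumptions, `deserialize_with_short_lived_metadata(b"abc")` yields
//! `Some("abc")`. If the trait used a single lifetime for both references, the function
//! would not compile, because the local `metadata` does not live as long as `'frame`.
//!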
//! # `type_check` and `deserialize`
//!
//! The deserialization process is divided into two parts: type checking and
//! actual deserialization, represented by `DeserializeValue`/`DeserializeRow`'s
//! methods called `type_check` and `deserialize`.
//!
//! The `deserialize` method can assume that `type_check` was called before, so
//! it doesn't have to verify the type again. This can be a performance gain
//! when deserializing query results with multiple rows: as each row in a result
//! has the same type, it is only necessary to call `type_check` once for the
//! whole result and then `deserialize` for each row.
//!
//! Note that `deserialize` is not an `unsafe` method - although you can be
//! sure that the driver will call `type_check` before `deserialize`, you
//! shouldn't do unsafe things based on this assumption.
//!
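//! The split described above can be sketched in a self-contained way. All names below
//! (`ColType`, the free functions `type_check` and `deserialize`, and `deserialize_column`)
//! are hypothetical and much simpler than the driver's traits; the sketch merely assumes
//! a column of big-endian 4-byte integers:
//!
//! ```rust
//! // Hypothetical stand-in for column type metadata.
//! #[derive(Debug)]
//! enum ColType {
//!     Int,
//!     Text,
//! }
//!
//! // Checks the column type; intended to be called once per result.
//! fn type_check(typ: &ColType) -> Result<(), String> {
//!     match typ {
//!         ColType::Int => Ok(()),
//!         other => Err(format!("expected Int, got {:?}", other)),
//!     }
//! }
//!
//! // Does not re-check the column type, but remains safe: malformed input
//! // produces an error rather than undefined behaviour.
//! fn deserialize(raw: &[u8]) -> Result<i32, String> {
//!     if raw.len() != 4 {
//!         return Err(format!("expected 4 bytes, got {}", raw.len()));
//!     }
//!     Ok(i32::from_be_bytes([raw[0], raw[1], raw[2], raw[3]]))
//! }
//!
//! // One `type_check` for the whole column, then `deserialize` per cell.
//! fn deserialize_column(typ: &ColType, cells: &[&[u8]]) -> Result<Vec<i32>, String> {
//!     type_check(typ)?;
//!     cells.iter().map(|cell| deserialize(cell)).collect()
//! }
//! ```
//!
//! In this sketch the type is verified once regardless of the number of cells, while
//! `deserialize` still validates the length of each cell instead of trusting it blindly.
//!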
//! # Data ownership
//!
//! Some CQL types can be easily consumed while still partially serialized.
//! For example, types like `blob` or `text` can be represented simply with
//! `&[u8]` and `&str` that point to a part of the serialized response.
//! This is more efficient than using `Vec<u8>` or `String` because it avoids
//! an allocation and a copy; however, it is less convenient because those types
//! are bound by a lifetime.
//!
//! The framework supports types that refer to the serialized response's memory
//! in three different ways:
//!
//! ## Owned types
//!
//! Some types don't borrow anything and fully own their data, e.g. `i32` or
//! `String`. They aren't constrained by any lifetime and should implement
//! the respective trait for _all_ lifetimes, i.e.:
//!
//! ```rust
//! # use scylla_cql::frame::response::result::{NativeType, ColumnType};
//! # use scylla_cql::deserialize::{DeserializationError, FrameSlice, TypeCheckError};
//! # use scylla_cql::deserialize::value::DeserializeValue;
//! use thiserror::Error;
//! struct MyVec(Vec<u8>);
//! #[derive(Debug, Error)]
//! enum MyDeserError {
//!     #[error("Expected bytes")]
//!     ExpectedBytes,
//!     #[error("Expected non-null")]
//!     ExpectedNonNull,
//! }
//! impl<'frame, 'metadata> DeserializeValue<'frame, 'metadata> for MyVec {
//!     fn type_check(typ: &ColumnType) -> Result<(), TypeCheckError> {
//!         if let ColumnType::Native(NativeType::Blob) = typ {
//!             return Ok(());
//!         }
//!         Err(TypeCheckError::new(MyDeserError::ExpectedBytes))
//!     }
//!
//!     fn deserialize(
//!         _typ: &'metadata ColumnType<'metadata>,
//!         v: Option<FrameSlice<'frame>>,
//!     ) -> Result<Self, DeserializationError> {
//!         v.ok_or_else(|| DeserializationError::new(MyDeserError::ExpectedNonNull))
//!             .map(|v| Self(v.as_slice().to_vec()))
//!     }
//! }
//! ```
//!
//! ## Borrowing types
//!
//! Some types do not fully contain their data but rather point to some
//! bytes in the serialized response, e.g. `&str` or `&[u8]`. Those types will
//! usually contain a lifetime in their definition. In order to properly
//! implement `DeserializeValue` or `DeserializeRow` for such a type, the `impl`
//! should still have a generic lifetime parameter, but the lifetimes from the
//! type definition should be constrained by the generic lifetime parameter.
//! For example:
//!
//! ```rust
//! # use scylla_cql::frame::response::result::{NativeType, ColumnType};
//! # use scylla_cql::deserialize::{DeserializationError, FrameSlice, TypeCheckError};
//! # use scylla_cql::deserialize::value::DeserializeValue;
//! use thiserror::Error;
//! struct MySlice<'a>(&'a [u8]);
//! #[derive(Debug, Error)]
//! enum MyDeserError {
//!     #[error("Expected bytes")]
//!     ExpectedBytes,
//!     #[error("Expected non-null")]
//!     ExpectedNonNull,
//! }
//! impl<'a, 'frame, 'metadata> DeserializeValue<'frame, 'metadata> for MySlice<'a>
//! where
//!     'frame: 'a,
//! {
//!     fn type_check(typ: &ColumnType) -> Result<(), TypeCheckError> {
//!         if let ColumnType::Native(NativeType::Blob) = typ {
//!             return Ok(());
//!         }
//!         Err(TypeCheckError::new(MyDeserError::ExpectedBytes))
//!     }
//!
//!     fn deserialize(
//!         _typ: &'metadata ColumnType<'metadata>,
//!         v: Option<FrameSlice<'frame>>,
//!     ) -> Result<Self, DeserializationError> {
//!         v.ok_or_else(|| DeserializationError::new(MyDeserError::ExpectedNonNull))
//!             .map(|v| Self(v.as_slice()))
//!     }
//! }
//! ```
//!
//! ## Reference-counted types
//!
//! Internally, the driver uses the `bytes::Bytes` type to keep the contents
//! of the serialized response. It supports creating derived `Bytes` objects
//! which point to a subslice but keep the whole, original `Bytes` object alive.
//!
//! During deserialization, a type can obtain a `Bytes` subslice that points
//! to the serialized value. This approach combines the advantages of the previous
//! two approaches - creating a derived `Bytes` object can be cheaper than
//! an allocation and a copy (it has `Arc`-like semantics) and the `Bytes`
//! type is not constrained by a lifetime. However, you should be aware that
//! the subslice will keep the whole `Bytes` object that holds the frame alive.
//! It is not recommended to use this approach for long-living objects because
//! it can introduce space leaks.
//!
//! Example:
//!
//! ```rust
//! # use scylla_cql::frame::response::result::{NativeType, ColumnType};
//! # use scylla_cql::deserialize::{DeserializationError, FrameSlice, TypeCheckError};
//! # use scylla_cql::deserialize::value::DeserializeValue;
//! # use bytes::Bytes;
//! use thiserror::Error;
//! struct MyBytes(Bytes);
//! #[derive(Debug, Error)]
//! enum MyDeserError {
//!     #[error("Expected bytes")]
//!     ExpectedBytes,
//!     #[error("Expected non-null")]
//!     ExpectedNonNull,
//! }
//! impl<'frame, 'metadata> DeserializeValue<'frame, 'metadata> for MyBytes {
//!     fn type_check(typ: &ColumnType) -> Result<(), TypeCheckError> {
//!         if let ColumnType::Native(NativeType::Blob) = typ {
//!             return Ok(());
//!         }
//!         Err(TypeCheckError::new(MyDeserError::ExpectedBytes))
//!     }
//!
//!     fn deserialize(
//!         _typ: &'metadata ColumnType<'metadata>,
//!         v: Option<FrameSlice<'frame>>,
//!     ) -> Result<Self, DeserializationError> {
//!         v.ok_or_else(|| DeserializationError::new(MyDeserError::ExpectedNonNull))
//!             .map(|v| Self(v.to_bytes()))
//!     }
//! }
//! ```
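//!
//! The space-leak caveat mentioned in this section can be sketched without the `bytes`
//! crate. `SharedSlice` and `subslice` below are hypothetical, std-only stand-ins that
//! only mimic the reference-counting behaviour; they are not how `Bytes` is implemented:
//!
//! ```rust
//! use std::ops::Range;
//! use std::sync::Arc;
//!
//! // Hypothetical analogue of a derived `Bytes` object: a shared handle to the
//! // whole buffer plus the range that this value actually covers.
//! struct SharedSlice {
//!     buf: Arc<[u8]>,
//!     range: Range<usize>,
//! }
//!
//! impl SharedSlice {
//!     // Returns only the bytes covered by this subslice.
//!     fn as_slice(&self) -> &[u8] {
//!         &self.buf[self.range.clone()]
//!     }
//! }
//!
//! // Creating a subslice bumps a reference count instead of copying bytes.
//! fn subslice(frame: &Arc<[u8]>, range: Range<usize>) -> SharedSlice {
//!     SharedSlice {
//!         buf: Arc::clone(frame),
//!         range,
//!     }
//! }
//! ```
//!
//! In this sketch, a 4-byte `SharedSlice` taken from a 1024-byte frame keeps all 1024
//! bytes allocated for as long as it lives, even after the original handle is dropped.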

pub mod frame_slice;
pub mod result;
pub mod row;
pub mod value;

pub use frame_slice::FrameSlice;

use std::error::Error;
use std::sync::Arc;

use thiserror::Error;

// Errors

/// An error indicating that a failure happened during type check.
///
/// The error is type-erased so that the crate users can define their own
/// type check impls and their errors. As for the impls defined or generated
/// by the driver itself, the following errors can be returned:
///
/// - [`row::BuiltinTypeCheckError`] is returned when type check of
///   one of the types with an impl built into the driver fails. It is also
///   returned from impls generated by the `DeserializeRow` macro.
/// - [`value::BuiltinTypeCheckError`] is analogous to the above but is
///   returned from [`DeserializeValue::type_check`](value::DeserializeValue::type_check)
///   instead, both in the case of builtin impls and impls generated by the
///   `DeserializeValue` macro.
///   It won't be returned by the `Session` directly, but it might be nested
///   in the [`row::BuiltinTypeCheckError`].
#[derive(Debug, Clone, Error)]
#[error("TypeCheckError: {0}")]
pub struct TypeCheckError(pub(crate) Arc<dyn std::error::Error + Send + Sync>);

impl TypeCheckError {
    /// Constructs a new `TypeCheckError`.
    #[inline]
    pub fn new(err: impl std::error::Error + Send + Sync + 'static) -> Self {
        Self(Arc::new(err))
    }

    /// Retrieves the error reason by downcasting it to a specific type.
    pub fn downcast_ref<T: std::error::Error + 'static>(&self) -> Option<&T> {
        self.0.downcast_ref()
    }
}

/// An error indicating that a failure happened during deserialization.
///
/// The error is type-erased so that the crate users can define their own
/// deserialization impls and their errors. As for the impls defined or generated
/// by the driver itself, the following errors can be returned:
///
/// - [`row::BuiltinDeserializationError`] is returned when deserialization of
///   one of the types with an impl built into the driver fails. It is also
///   returned from impls generated by the `DeserializeRow` macro.
/// - [`value::BuiltinDeserializationError`] is analogous to the above but is
///   returned from [`DeserializeValue::deserialize`](value::DeserializeValue::deserialize)
///   instead, both in the case of builtin impls and impls generated by the
///   `DeserializeValue` macro.
///   It won't be returned by the `Session` directly, but it might be nested
///   in the [`row::BuiltinDeserializationError`].
#[derive(Debug, Clone, Error)]
#[error("DeserializationError: {0}")]
pub struct DeserializationError(Arc<dyn Error + Send + Sync>);

impl DeserializationError {
    /// Constructs a new `DeserializationError`.
    #[inline]
    pub fn new(err: impl Error + Send + Sync + 'static) -> Self {
        Self(Arc::new(err))
    }

    /// Retrieves the error reason by downcasting it to a specific type.
    pub fn downcast_ref<T: Error + 'static>(&self) -> Option<&T> {
        self.0.downcast_ref()
    }
}

// This is a hack to enable setting the proper Rust type name in error messages,
// even though the error originates from some helper type used underneath.
// ASSUMPTION: This should be used:
// - ONLY in proper type_check()/deserialize() implementation,
// - BEFORE an error is cloned (because otherwise the Arc::get_mut fails).
macro_rules! make_error_replace_rust_name {
    ($privacy: vis, $fn_name: ident, $outer_err: ty, $inner_err: ty) => {
        // Not part of the public API; used in derive macros.
        #[doc(hidden)]
        #[allow(clippy::needless_pub_self)]
        $privacy fn $fn_name<RustT>(mut err: $outer_err) -> $outer_err {
            // Safety: the assumed usage of this function guarantees that the Arc has not yet been cloned.
            let arc_mut = std::sync::Arc::get_mut(&mut err.0).unwrap();

            let rust_name: &mut &str = {
                if let Some(err) = arc_mut.downcast_mut::<$inner_err>() {
                    &mut err.rust_name
                } else {
                    unreachable!(concat!(
                        "This function is assumed to be called only on built-in ",
                        stringify!($inner_err),
                        " kinds."
                    ))
                }
            };

            *rust_name = std::any::type_name::<RustT>();
            err
        }
    };
}
use make_error_replace_rust_name;

#[cfg(test)]
pub(crate) mod tests {
    use bytes::{Bytes, BytesMut};

    use crate::frame::response::result::{ColumnSpec, ColumnType, TableSpec};
    use crate::frame::types;

    pub(super) static CELL1: &[u8] = &[1, 2, 3];
    pub(super) static CELL2: &[u8] = &[4, 5, 6, 7];

    pub(super) fn serialize_cells(
        cells: impl IntoIterator<Item = Option<impl AsRef<[u8]>>>,
    ) -> Bytes {
        let mut bytes = BytesMut::new();
        for cell in cells {
            types::write_bytes_opt(cell, &mut bytes).unwrap();
        }
        bytes.freeze()
    }

    pub(crate) const fn spec<'a>(name: &'a str, typ: ColumnType<'a>) -> ColumnSpec<'a> {
        ColumnSpec::borrowed(name, typ, TableSpec::borrowed("ks", "tbl"))
    }
}
348}