//! Framework for deserialization of data returned by database queries.
//!
//! Deserialization is based on two traits:
//!
//! - A type that implements `DeserializeValue<'frame, 'metadata>` can be deserialized
//!   from a single _CQL value_ - i.e. an element of a row in the query result,
//! - A type that implements `DeserializeRow<'frame, 'metadata>` can be deserialized
//!   from a single _row_ of a query result.
//!
//! Those traits are quite similar to each other, both in the idea behind them
//! and the interface that they expose.
//!
//! It's important to understand what a _deserialized type_ is. It's not just
//! an implementor of `Deserialize{Value, Row}`; some implementors of
//! `Deserialize{Value, Row}` are not yet final types, but **partially**
//! deserialized types that support further deserialization - _type
//! deserializers_, such as `ListlikeIterator`, `UdtIterator` or `ColumnIterator`.
//!
//! # Lifetime parameters
//!
//! - `'frame` is the lifetime of the frame. Any deserialized type that is going to borrow
//!   from the frame must have its lifetime bound by `'frame`.
//! - `'metadata` is the lifetime of the result metadata. It is separate from `'frame`
//!   because result metadata is only needed during deserialization itself; the **final**
//!   deserialized types (i.e. those that are not going to deserialize anything else,
//!   unlike e.g. `MapIterator`) can later live independently of the metadata.
//!
//! _Type deserializers_, as they still need to deserialize some type, are naturally bound
//! by the `'metadata` lifetime. However, final types are completely deserialized, so they
//! should not be bound by `'metadata` - only by `'frame`.
//!
//! Rationale:
//! `DeserializeValue` requires two pieces of data in order to perform
//! deserialization:
//! 1) a reference to the CQL frame (a `FrameSlice`),
//! 2) the type of the column being deserialized, being part of the
//!    `ResultMetadata`.
//!
//! Similarly, `DeserializeRow` requires two pieces of data in order to
//! perform deserialization:
//! 1) a reference to the CQL frame (a `FrameSlice`),
//! 2) a slice of specifications of all columns in the row, being part of
//!    the `ResultMetadata`.
//!
//! When deserializing owned types, the frame and the metadata can have any
//! lifetimes; they do not matter. Borrowed types, however, borrow from the
//! frame, so their lifetime must necessarily be bound by the lifetime of the
//! frame. Metadata is only needed for the deserialization itself, so its
//! lifetime does not inherently bound the deserialized value. To avoid
//! unnecessarily shortening the deserialized values' lifetime to the
//! metadata's lifetime (which would happen if the metadata's and the frame's
//! lifetimes were unified in value deserializers), a separate lifetime
//! parameter is introduced for result metadata: `'metadata`.
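//!
//! The effect of keeping the two lifetimes separate can be illustrated with a
//! minimal, self-contained sketch. The `Frame` and `Metadata` structs and the
//! `deserialize_blob` function below are hypothetical stand-ins invented for this
//! illustration, not driver types; only the lifetime pattern is meant to carry over.
//!
//! ```rust
//! // Hypothetical stand-in for the serialized frame.
//! struct Frame {
//!     bytes: Vec<u8>,
//! }
//!
//! // Hypothetical stand-in for the result metadata.
//! struct Metadata {
//!     type_name: String,
//! }
//!
//! // The returned slice borrows only from the frame. The metadata is consulted
//! // during the call but is not retained, so `'metadata` does not appear in
//! // the return type.
//! fn deserialize_blob<'frame, 'metadata>(
//!     frame: &'frame Frame,
//!     metadata: &'metadata Metadata,
//! ) -> Option<&'frame [u8]> {
//!     if metadata.type_name == "blob" {
//!         Some(frame.bytes.as_slice())
//!     } else {
//!         None
//!     }
//! }
//!
//! fn demo() -> usize {
//!     let frame = Frame { bytes: vec![1, 2, 3] };
//!     let value = {
//!         let metadata = Metadata { type_name: String::from("blob") };
//!         let value = deserialize_blob(&frame, &metadata);
//!         value
//!         // `metadata` is dropped at the end of this block...
//!     };
//!     // ...yet `value` is still usable, because it borrows only from `frame`.
//!     value.map_or(0, |bytes| bytes.len())
//! }
//! ```
//!
//! Had `deserialize_blob` used a single lifetime for both arguments, the returned
//! slice would be limited to the shorter of the two borrows and `demo` would not
//! compile.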
//!
//! # `type_check` and `deserialize`
//!
//! The deserialization process is divided into two parts: type checking and
//! actual deserialization, represented by the `type_check` and `deserialize`
//! methods of `DeserializeValue`/`DeserializeRow`.
//!
//! The `deserialize` method can assume that `type_check` was called before, so
//! it doesn't have to verify the type again. This can be a performance gain
//! when deserializing query results with multiple rows: as each row in a result
//! has the same type, it is only necessary to call `type_check` once for the
//! whole result and then `deserialize` for each row.
//!
//! Note that `deserialize` is not an `unsafe` method - although you can be
//! sure that the driver will call `type_check` before `deserialize`, you
//! shouldn't do unsafe things based on this assumption.
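//!
//! The split described above can be sketched with a toy model. `ToyType`,
//! `ToyDeserialize` and `deserialize_column` are hypothetical names made up for
//! this illustration and are much simpler than the real traits; the sketch only
//! shows the "check once, deserialize many times" pattern.
//!
//! ```rust
//! use std::convert::TryFrom;
//!
//! // Hypothetical, simplified stand-in for a column type.
//! #[derive(Debug)]
//! enum ToyType {
//!     Int,
//!     Blob,
//! }
//!
//! // Hypothetical, simplified analogue of `DeserializeValue`.
//! trait ToyDeserialize: Sized {
//!     fn type_check(typ: &ToyType) -> Result<(), String>;
//!     fn deserialize(raw: &[u8]) -> Result<Self, String>;
//! }
//!
//! impl ToyDeserialize for i32 {
//!     fn type_check(typ: &ToyType) -> Result<(), String> {
//!         match typ {
//!             ToyType::Int => Ok(()),
//!             other => Err(format!("expected Int, got {:?}", other)),
//!         }
//!     }
//!
//!     // Does not re-check the column type, but still validates the input with
//!     // safe code: nothing `unsafe` relies on `type_check` having been called.
//!     fn deserialize(raw: &[u8]) -> Result<Self, String> {
//!         let bytes = <[u8; 4]>::try_from(raw)
//!             .map_err(|_| format!("expected 4 bytes, got {}", raw.len()))?;
//!         Ok(i32::from_be_bytes(bytes))
//!     }
//! }
//!
//! // Type-checks the column once, then deserializes the cell of every row.
//! fn deserialize_column<T: ToyDeserialize>(
//!     typ: &ToyType,
//!     cells: &[&[u8]],
//! ) -> Result<Vec<T>, String> {
//!     T::type_check(typ)?;
//!     cells.iter().map(|cell| T::deserialize(cell)).collect()
//! }
//! ```
//!
//! In this sketch, a mismatched column type is reported before any cell is touched,
//! and a malformed cell still results in an `Err` rather than undefined behaviour.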
//!
//! # Data ownership
//!
//! Some CQL types can be easily consumed while still partially serialized.
//! For example, types like `blob` or `text` can be represented simply with
//! `&[u8]` and `&str` that point to a part of the serialized response.
//! This is more efficient than using `Vec<u8>` or `String` because it avoids
//! an allocation and a copy; however, it is less convenient because those types
//! are bound by a lifetime.
//!
//! The framework supports types that refer to the serialized response's memory
//! in three different ways:
//!
//! ## Owned types
//!
//! Some types don't borrow anything and fully own their data, e.g. `i32` or
//! `String`. They aren't constrained by any lifetime and should implement
//! the respective trait for _all_ lifetimes, i.e.:
//!
//! ```rust
//! # use scylla_cql::frame::response::result::{NativeType, ColumnType};
//! # use scylla_cql::deserialize::{DeserializationError, FrameSlice, TypeCheckError};
//! # use scylla_cql::deserialize::value::DeserializeValue;
//! use thiserror::Error;
//! struct MyVec(Vec<u8>);
//! #[derive(Debug, Error)]
//! enum MyDeserError {
//!     #[error("Expected bytes")]
//!     ExpectedBytes,
//!     #[error("Expected non-null")]
//!     ExpectedNonNull,
//! }
//! impl<'frame, 'metadata> DeserializeValue<'frame, 'metadata> for MyVec {
//!     fn type_check(typ: &ColumnType) -> Result<(), TypeCheckError> {
//!         if let ColumnType::Native(NativeType::Blob) = typ {
//!             return Ok(());
//!         }
//!         Err(TypeCheckError::new(MyDeserError::ExpectedBytes))
//!     }
//!
//!     fn deserialize(
//!         _typ: &'metadata ColumnType<'metadata>,
//!         v: Option<FrameSlice<'frame>>,
//!     ) -> Result<Self, DeserializationError> {
//!         // Copy the bytes out of the frame so that `MyVec` owns its data.
//!         v.ok_or_else(|| DeserializationError::new(MyDeserError::ExpectedNonNull))
//!             .map(|v| Self(v.as_slice().to_vec()))
//!     }
//! }
//! ```
//!
//! ## Borrowing types
//!
//! Some types do not fully contain their data but rather point to some
//! bytes in the serialized response, e.g. `&str` or `&[u8]`. Those types
//! usually contain a lifetime in their definition. In order to properly
//! implement `DeserializeValue` or `DeserializeRow` for such a type, the `impl`
//! should still have a generic lifetime parameter, but the lifetimes from the
//! type definition should be constrained by the generic lifetime parameter.
//! For example:
//!
//! ```rust
//! # use scylla_cql::frame::response::result::{NativeType, ColumnType};
//! # use scylla_cql::deserialize::{DeserializationError, FrameSlice, TypeCheckError};
//! # use scylla_cql::deserialize::value::DeserializeValue;
//! use thiserror::Error;
//! struct MySlice<'a>(&'a [u8]);
//! #[derive(Debug, Error)]
//! enum MyDeserError {
//!     #[error("Expected bytes")]
//!     ExpectedBytes,
//!     #[error("Expected non-null")]
//!     ExpectedNonNull,
//! }
//! impl<'a, 'frame, 'metadata> DeserializeValue<'frame, 'metadata> for MySlice<'a>
//! where
//!     'frame: 'a,
//! {
//!     fn type_check(typ: &ColumnType) -> Result<(), TypeCheckError> {
//!         if let ColumnType::Native(NativeType::Blob) = typ {
//!             return Ok(());
//!         }
//!         Err(TypeCheckError::new(MyDeserError::ExpectedBytes))
//!     }
//!
//!     fn deserialize(
//!         _typ: &'metadata ColumnType<'metadata>,
//!         v: Option<FrameSlice<'frame>>,
//!     ) -> Result<Self, DeserializationError> {
//!         // Borrow the bytes directly from the frame; no copy is made.
//!         v.ok_or_else(|| DeserializationError::new(MyDeserError::ExpectedNonNull))
//!             .map(|v| Self(v.as_slice()))
//!     }
//! }
//! ```
//!
//! ## Reference-counted types
//!
//! Internally, the driver uses the `bytes::Bytes` type to keep the contents
//! of the serialized response. It supports creating derived `Bytes` objects
//! which point to a subslice but keep the whole, original `Bytes` object alive.
//!
//! During deserialization, a type can obtain a `Bytes` subslice that points
//! to the serialized value. This approach combines the advantages of the previous
//! two approaches - creating a derived `Bytes` object can be cheaper than
//! an allocation and a copy (it has `Arc`-like semantics) and the `Bytes`
//! type is not constrained by a lifetime. However, you should be aware that
//! the subslice will keep the whole `Bytes` object that holds the frame alive.
//! It is not recommended to use this approach for long-lived objects because
//! it can introduce space leaks.
//!
//! Example:
//!
//! ```rust
//! # use scylla_cql::frame::response::result::{NativeType, ColumnType};
//! # use scylla_cql::deserialize::{DeserializationError, FrameSlice, TypeCheckError};
//! # use scylla_cql::deserialize::value::DeserializeValue;
//! # use bytes::Bytes;
//! use thiserror::Error;
//! struct MyBytes(Bytes);
//! #[derive(Debug, Error)]
//! enum MyDeserError {
//!     #[error("Expected bytes")]
//!     ExpectedBytes,
//!     #[error("Expected non-null")]
//!     ExpectedNonNull,
//! }
//! impl<'frame, 'metadata> DeserializeValue<'frame, 'metadata> for MyBytes {
//!     fn type_check(typ: &ColumnType) -> Result<(), TypeCheckError> {
//!         if let ColumnType::Native(NativeType::Blob) = typ {
//!             return Ok(());
//!         }
//!         Err(TypeCheckError::new(MyDeserError::ExpectedBytes))
//!     }
//!
//!     fn deserialize(
//!         _typ: &'metadata ColumnType<'metadata>,
//!         v: Option<FrameSlice<'frame>>,
//!     ) -> Result<Self, DeserializationError> {
//!         // Obtain a derived `Bytes` that keeps the underlying frame alive.
//!         v.ok_or_else(|| DeserializationError::new(MyDeserError::ExpectedNonNull))
//!             .map(|v| Self(v.to_bytes()))
//!     }
//! }
//! ```

pub mod frame_slice;
pub mod result;
pub mod row;
pub mod value;

pub use frame_slice::FrameSlice;

use std::error::Error;
use std::sync::Arc;

use thiserror::Error;

// Errors

/// An error indicating that a failure happened during type check.
///
/// The error is type-erased so that the crate users can define their own
/// type check impls and their errors. As for the impls defined or generated
/// by the driver itself, the following errors can be returned:
///
/// - [`row::BuiltinTypeCheckError`] is returned when the type check of
///   one of the types with an impl built into the driver fails. It is also
///   returned from impls generated by the `DeserializeRow` macro.
/// - [`value::BuiltinTypeCheckError`] is analogous to the above but is
///   returned from [`DeserializeValue::type_check`](value::DeserializeValue::type_check)
///   instead, both in the case of builtin impls and impls generated by the
///   `DeserializeValue` macro.
///   It won't be returned by the `Session` directly, but it might be nested
///   in the [`row::BuiltinTypeCheckError`].
#[derive(Debug, Clone, Error)]
#[error("TypeCheckError: {0}")]
pub struct TypeCheckError(pub(crate) Arc<dyn std::error::Error + Send + Sync>);

impl TypeCheckError {
    /// Constructs a new `TypeCheckError`.
    #[inline]
    pub fn new(err: impl std::error::Error + Send + Sync + 'static) -> Self {
        Self(Arc::new(err))
    }

    /// Retrieves the error reason by downcasting to a specific type.
    pub fn downcast_ref<T: std::error::Error + 'static>(&self) -> Option<&T> {
        self.0.downcast_ref()
    }
}

/// An error indicating that a failure happened during deserialization.
///
/// The error is type-erased so that the crate users can define their own
/// deserialization impls and their errors. As for the impls defined or generated
/// by the driver itself, the following errors can be returned:
///
/// - [`row::BuiltinDeserializationError`] is returned when deserialization of
///   one of the types with an impl built into the driver fails. It is also
///   returned from impls generated by the `DeserializeRow` macro.
/// - [`value::BuiltinDeserializationError`] is analogous to the above but is
///   returned from [`DeserializeValue::deserialize`](value::DeserializeValue::deserialize)
///   instead, both in the case of builtin impls and impls generated by the
///   `DeserializeValue` macro.
///   It won't be returned by the `Session` directly, but it might be nested
///   in the [`row::BuiltinDeserializationError`].
#[derive(Debug, Clone, Error)]
#[error("DeserializationError: {0}")]
pub struct DeserializationError(Arc<dyn Error + Send + Sync>);

impl DeserializationError {
    /// Constructs a new `DeserializationError`.
    #[inline]
    pub fn new(err: impl Error + Send + Sync + 'static) -> Self {
        Self(Arc::new(err))
    }

    /// Retrieves the error reason by downcasting to a specific type.
    pub fn downcast_ref<T: Error + 'static>(&self) -> Option<&T> {
        self.0.downcast_ref()
    }
}

// This is a hack to enable setting the proper Rust type name in error messages,
// even though the error originates from some helper type used underneath.
// ASSUMPTION: This should be used:
// - ONLY in a proper type_check()/deserialize() implementation,
// - BEFORE the error is cloned (because otherwise Arc::get_mut fails).
macro_rules! make_error_replace_rust_name {
    ($privacy: vis, $fn_name: ident, $outer_err: ty, $inner_err: ty) => {
        // Not part of the public API; used in derive macros.
        #[doc(hidden)]
        #[allow(clippy::needless_pub_self)]
        $privacy fn $fn_name<RustT>(mut err: $outer_err) -> $outer_err {
            // Safety: the assumed usage of this function guarantees that the Arc has not yet been cloned.
            let arc_mut = std::sync::Arc::get_mut(&mut err.0).unwrap();

            let rust_name: &mut &str = {
                if let Some(err) = arc_mut.downcast_mut::<$inner_err>() {
                    &mut err.rust_name
                } else {
                    unreachable!(concat!(
                        "This function is assumed to be called only on built-in ",
                        stringify!($inner_err),
                        " kinds."
                    ))
                }
            };

            *rust_name = std::any::type_name::<RustT>();
            err
        }
    };
}
use make_error_replace_rust_name;

#[cfg(test)]
pub(crate) mod tests {
    use bytes::{Bytes, BytesMut};

    use crate::frame::response::result::{ColumnSpec, ColumnType, TableSpec};
    use crate::frame::types;

    pub(super) static CELL1: &[u8] = &[1, 2, 3];
    pub(super) static CELL2: &[u8] = &[4, 5, 6, 7];

    pub(super) fn serialize_cells(
        cells: impl IntoIterator<Item = Option<impl AsRef<[u8]>>>,
    ) -> Bytes {
        let mut bytes = BytesMut::new();
        for cell in cells {
            types::write_bytes_opt(cell, &mut bytes).unwrap();
        }
        bytes.freeze()
    }

    pub(crate) const fn spec<'a>(name: &'a str, typ: ColumnType<'a>) -> ColumnSpec<'a> {
        ColumnSpec::borrowed(name, typ, TableSpec::borrowed("ks", "tbl"))
    }
}