class utf8

Immutable uint8 buffer for UTF-8 binary data

class utf8 does Blob[uint8] is repr('VMArray') {}

A utf8 is a subtype of Blob which is specifically uint8 data for holding UTF-8 encoded text.

my utf8 $b = "hello".encode;
say $b[1].fmt("0x%X"); # OUTPUT: «0x65␤»

Type Graph

Type relations for utf8: utf8 inherits from Any (which inherits from Mu) and does the role Blob, which in turn does the roles Positional and Stringy.

Routines supplied by role Blob

utf8 does role Blob, which provides the following routines:

(Blob) method new

Defined as:

multi method new(Blob:)
multi method new(Blob: Blob:D $blob)
multi method new(Blob: int @values)
multi method new(Blob: @values)
multi method new(Blob: *@values)

Creates an empty Blob, or a new Blob from another Blob, or from a list of integers or values (which will have to be coerced into integers):

my $blob = Blob.new([1, 2, 3]);
say Blob.new(<1 2 3>); # OUTPUT: «Blob:0x<01 02 03>␤»

(Blob) method Bool

Defined as:

multi method Bool(Blob:D:)

Returns False if and only if the buffer is empty.

my $blob = Blob.new();
say $blob.Bool; # OUTPUT: «False␤»
$blob = Blob.new([1, 2, 3]);
say $blob.Bool; # OUTPUT: «True␤»

(Blob) method Capture

Defined as:

method Capture(Blob:D:)

Equivalent to calling .List.Capture on the invocant.
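
As a small illustration (the variable names are arbitrary), the following sketch turns a Blob into a Capture and inspects its positional arguments:

```raku
my $blob    = Blob.new(1, 2, 3);
my $capture = $blob.Capture;   # same as $blob.List.Capture

say $capture.elems;            # OUTPUT: «3␤»
say $capture[1];               # OUTPUT: «2␤»
```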

(Blob) method elems

Defined as:

multi method elems(Blob:D:)
multi method elems(Blob:U: --> 1)

Returns the number of elements of the buffer.

my $blob = Blob.new([1, 2, 3]);
say $blob.elems; # OUTPUT: «3␤»

It will also return 1 on the class object.

(Blob) method bytes

Defined as:

method bytes(Blob:D: --> Int:D)

Returns the number of bytes used by the elements in the buffer.

say Blob.new([1, 2, 3]).bytes;      # OUTPUT: «3␤»
say blob16.new([1, 2, 3]).bytes;    # OUTPUT: «6␤»
say blob64.new([1, 2, 3]).bytes;    # OUTPUT: «24␤»

(Blob) method chars

Defined as:

method chars(Blob:D:)

Throws X::Buf::AsStr with chars as payload.

(Blob) method Str

Defined as:

multi method Str(Blob:D:)

Throws X::Buf::AsStr with Str as payload. In order to convert to a Str you need to use .decode.

(Blob) method Stringy

Defined as:

multi method Stringy(Blob:D:)

Throws X::Buf::AsStr with Stringy as payload.
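
A minimal sketch of how this behaves in practice, using a plain Blob and an arbitrary flag variable `$failed`: the stringification attempt is caught, and .decode is used to obtain the text instead.

```raku
my $blob = Blob.new(104, 105);      # the bytes of "hi"

my $failed = False;
{
    $blob.Str;                      # throws instead of stringifying
    CATCH {
        when X::Buf::AsStr { $failed = True }
    }
}

say $failed;                        # OUTPUT: «True␤»
say $blob.decode('utf-8');          # OUTPUT: «hi␤»
```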

(Blob) method decode

Defined as:

multi method decode(Blob:D: $encoding = self.encoding // "utf-8")
multi method decode(Blob:D: $encoding, Str :$replacement!,
                    Bool:D :$strict = False)
multi method decode(Blob:D: $encoding, Bool:D :$strict = False)

Applies an encoding to turn the blob into a Str; the encoding will be UTF-8 by default.

my Blob $blob = "string".encode('utf-8');
say $blob.decode('utf-8'); # OUTPUT: «string␤»

On malformed utf-8 .decode will throw X::AdHoc. To handle sloppy utf-8 use utf8-c8.
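
The difference can be sketched as follows. The byte sequence is an arbitrary example containing 0xFF, which can never occur in valid UTF-8. Decoding with utf8-c8 and encoding again with utf8-c8 is expected to give back the original bytes, so the last line should report True.

```raku
my $bytes = Blob.new(0x41, 0xFF, 0x42);   # "A", an invalid byte, "B"

# Strict UTF-8 decoding throws; `try` turns that into an undefined value.
my $strict = try $bytes.decode('utf-8');
say $strict.defined;                      # OUTPUT: «False␤»

# utf8-c8 decodes the same bytes without throwing.
my $lenient = $bytes.decode('utf8-c8');

# Compare the round-tripped bytes with the original ones.
say $lenient.encode('utf8-c8').list eqv $bytes.list;
```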

(Blob) method list

Defined as:

multi method list(Blob:D:)

Returns the list of codepoints:

say "zipi".encode("ascii").list; # OUTPUT: «(122 105 112 105)␤»

(Blob) method gist

Defined as:

method gist(Blob:D: --> Str:D)

Returns the string containing the "gist" of the Blob, listing up to the first 100 elements, separated by space, appending an ellipsis if the Blob has more than 100 elements.

put Blob.new(1, 2, 3).gist; # OUTPUT: «Blob:0x<01 02 03>␤»
put Blob.new(1..2000).gist;
# OUTPUT: 
# Blob:0x<01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13 14 15 
# 16 17 18 19 1A 1B 1C 1D 1E 1F 20 21 22 23 24 25 26 27 28 29 2A 2B 2C 
# 2D 2E 2F 30 31 32 33 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F 40 41 42 43 
# 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F 50 51 52 53 54 55 56 57 58 59 5A 
# 5B 5C 5D 5E 5F 60 61 62 63 64 ...>

(Blob) method subbuf

Defined as:

multi method subbuf(Int $from, Int $len = self.elems --> Blob:D)
multi method subbuf(Range $range --> Blob:D)

Extracts a part of the invocant buffer, starting at the element with index $from and taking $len elements (or fewer if the buffer is shorter), and returns the result as a new buffer.

say Blob.new(1..10).subbuf(2,4);     # OUTPUT: «Blob:0x<03 04 05 06>␤»
say Blob.new(1..10).subbuf(*-2);     # OUTPUT: «Blob:0x<09 0A>␤»
say Blob.new(1..10).subbuf(*-5,2);   # OUTPUT: «Blob:0x<06 07>␤»

For convenience, also allows a Range to be specified to indicate which part of the invocant buffer you would like:

say Blob.new(1..10).subbuf(2..5);    # OUTPUT: «Blob:0x<03 04 05 06>␤»

(Blob) method allocate

Defined as:

multi method allocate(Blob:U: Int:D $elements)
multi method allocate(Blob:U: Int:D $elements, int $value)
multi method allocate(Blob:U: Int:D $elements, Int:D \value)
multi method allocate(Blob:U: Int:D $elements, Mu:D $got)
multi method allocate(Blob:U: Int:D $elements, int @values)
multi method allocate(Blob:U: Int:D $elements, Blob:D $blob)
multi method allocate(Blob:U: Int:D $elements, @values)

Returns a newly created Blob object with the given number of elements. Optionally takes a second argument that indicates the pattern with which to fill the Blob: this can be a single (possibly native) integer value, or any Iterable that generates integer values, including another Blob. The pattern will be repeated if not enough values are given to fill the entire Blob.

my Blob $b0 = Blob.allocate(10,0);
$b0.say; # OUTPUT: «Blob:0x<00 00 00 00 00 00 00 00 00 00>␤»

If the pattern is a general Mu value, it will fail.
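
To make the fill behavior concrete, here is a short sketch (variable names are arbitrary) with no pattern and with a pattern that is shorter than the requested size:

```raku
# No pattern: every element is zero.
my $zeros = Blob.allocate(3);
say $zeros;     # OUTPUT: «Blob:0x<00 00 00>␤»

# A two-element pattern is repeated until five elements are filled.
my $cycled = Blob.allocate(5, (1, 2));
say $cycled;    # OUTPUT: «Blob:0x<01 02 01 02 01>␤»
```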

(Blob) method unpack

This method is considered experimental; to use it, you need to enable it with:

use experimental :pack;

Defined as:

multi method unpack(Blob:D: Str:D $template)
multi method unpack(Blob:D: @template)
multi sub unpack(Blob:D \blob, Str:D $template)
multi sub unpack(Blob:D \blob, @template)

Extracts features from the blob according to the template string, and returns them as a list.

The template string consists of zero or more units that begin with an ASCII letter, and are optionally followed by a quantifier. The quantifier can be * (which typically stands for "use up the rest of the Blob here"), or a positive integer (without a +).

Whitespace between template units is ignored.

Examples of valid templates include "A4 C n*" and "A*".

The following letters are recognized:

Letter  Meaning
A       Extract a string, where each element of the Blob maps to a codepoint
a       Same as A
C       Extract an element from the blob as an integer
H       Extracts a hex string
L       Extracts four elements and returns them as a single unsigned integer
n       Extracts two elements and combines them in "network" (BigEndian) byte order into a single integer
N       Extracts four elements and combines them in "network" (BigEndian) byte order into a single integer
S       Extracts two elements and returns them as a single unsigned integer
v       Same as S
V       Same as L
x       Drop an element from the blob (that is, ignore it)
Z       Same as A

Example:

use experimental :pack;
say Blob.new(1..10).unpack("C*");
# OUTPUT: «(1 2 3 4 5 6 7 8 9 10)␤»
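
A further sketch combining two template units, with arbitrary example bytes: "n" reads a 16-bit big-endian integer and "A2" reads two elements as a string.

```raku
use experimental :pack;

my $blob = Blob.new(0x01, 0x02, 0x68, 0x69);
my ($number, $text) = $blob.unpack("n A2");

say $number;   # OUTPUT: «258␤»
say $text;     # OUTPUT: «hi␤»
```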

(Blob) sub pack

This subroutine is considered experimental; to use it, you need to enable it with:

use experimental :pack;

Defined as:

multi sub pack(Str $template, *@items)
multi sub pack(@template, *@items)

Packs the given items according to the template and returns a buffer containing the packed bytes.

The template string consists of zero or more units that begin with an ASCII letter, and are optionally followed by a quantifier. For details, see unpack.
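
As a hedged illustration with arbitrary values, the following packs a 16-bit big-endian integer and a single byte, then unpacks them with the same template:

```raku
use experimental :pack;

my $packed = pack("n C", 258, 65);

say $packed.list;             # OUTPUT: «(1 2 65)␤»
say $packed.unpack("n C");    # OUTPUT: «(258 65)␤»
```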

(Blob) method reverse

Defined as:

method reverse(Blob:D: --> Blob:D)

Returns a Blob with all elements in reversed order.

say Blob.new([1, 2, 3]).reverse;    # OUTPUT: «Blob:0x<03 02 01>␤»
say blob16.new([2]).reverse;        # OUTPUT: «Blob[uint16]:0x<02>␤»
say buf32.new([16, 32]).reverse;    # OUTPUT: «Buf[uint32]:0x<20 10>␤»

(Blob) method read-uint8

Defined as:

method read-uint8(blob8:D: uint $pos, $endian = NativeEndian --> uint)

Returns an unsigned native integer value for the byte at the given position. The $endian parameter has no meaning, but is available for consistency.

(Blob) method read-int8

Defined as:

method read-int8(blob8:D: uint $pos, $endian = NativeEndian --> int)

Returns a native int value for the byte at the given position. The $endian parameter has no meaning, but is available for consistency.
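
The difference between the unsigned and signed single-byte readers can be sketched with two arbitrary bytes:

```raku
my $blob = Blob.new(0x7F, 0xFF);

say $blob.read-uint8(1);   # OUTPUT: «255␤»
say $blob.read-int8(1);    # OUTPUT: «-1␤»
say $blob.read-int8(0);    # OUTPUT: «127␤»
```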

(Blob) method read-uint16

Defined as:

method read-uint16(blob8:D: uint $pos, $endian = NativeEndian --> uint)

Returns a native uint value for the two bytes starting at the given position.

(Blob) method read-int16

Defined as:

method read-int16(blob8:D: uint $pos, $endian = NativeEndian --> int)

Returns a native int value for the two bytes starting at the given position.
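
For multi-byte reads the $endian argument matters. A minimal sketch with two arbitrary bytes, passing the byte order explicitly so the result does not depend on the machine:

```raku
my $blob = Blob.new(0xFF, 0xFE);

say $blob.read-uint16(0, BigEndian);      # OUTPUT: «65534␤»
say $blob.read-uint16(0, LittleEndian);   # OUTPUT: «65279␤»
say $blob.read-int16(0, BigEndian);       # OUTPUT: «-2␤»
```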

(Blob) method read-uint32

Defined as:

method read-uint32(blob8:D: uint $pos, $endian = NativeEndian --> uint)

Returns a native uint value for the four bytes starting at the given position.

(Blob) method read-int32

Defined as:

method read-int32(blob8:D: uint $pos, $endian = NativeEndian --> int)

Returns a native int value for the four bytes starting at the given position.

(Blob) method read-uint64

Defined as:

method read-uint64(blob8:D: uint $pos, $endian = NativeEndian --> UInt:D)

Returns an unsigned integer value for the eight bytes starting at the given position.

(Blob) method read-int64

Defined as:

method read-int64(blob8:D: uint $pos, $endian = NativeEndian --> int)

Returns a native int value for the eight bytes starting at the given position.

(Blob) method read-uint128

Defined as:

method read-uint128(blob8:D: uint $pos, $endian = NativeEndian --> UInt:D)

Returns an unsigned integer value for the sixteen bytes starting at the given position.

(Blob) method read-int128

Defined as:

method read-int128(blob8:D: uint $pos, $endian = NativeEndian --> Int:D)

Returns an integer value for the sixteen bytes starting at the given position.

(Blob) method read-num32

Defined as:

method read-num32(blob8:D: uint $pos, $endian = NativeEndian --> num)

Returns a native num value for the four bytes starting at the given position.

(Blob) method read-num64

Defined as:

method read-num64(blob8:D: uint $pos, $endian = NativeEndian --> num)

Returns a native num value for the eight bytes starting at the given position.
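
As an illustration, the bytes below are the big-endian IEEE 754 encodings of 1.0 in single and double precision (the variable names are arbitrary):

```raku
my $single = Blob.new(0x3F, 0x80, 0x00, 0x00);
my $double = Blob.new(0x3F, 0xF0, 0, 0, 0, 0, 0, 0);

say $single.read-num32(0, BigEndian);   # OUTPUT: «1␤»
say $double.read-num64(0, BigEndian);   # OUTPUT: «1␤»
```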

(Blob) method read-ubits

Defined as:

method read-ubits(blob8:D: uint $pos, uint $bits --> UInt:D)

Returns an unsigned integer value for the bits from the given bit offset and given number of bits. The endianness of the bits is assumed to be BigEndian.

(Blob) method read-bits

Defined as:

method read-bits(blob8:D: uint $pos, uint $bits --> Int:D)

Returns a signed integer value for the bits from the given bit offset and given number of bits. The endianness of the bits is assumed to be BigEndian.
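
A short sketch of bit-level reads on a single arbitrary byte. Offsets and lengths are counted in bits from the most significant bit, and the signed result assumes the usual two's complement interpretation:

```raku
my $blob = Blob.new(0b1011_0000);

say $blob.read-ubits(0, 4);   # bits 1011  OUTPUT: «11␤»
say $blob.read-ubits(2, 3);   # bits 110   OUTPUT: «6␤»
say $blob.read-bits(0, 4);    # bits 1011  OUTPUT: «-5␤»
```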

Routines supplied by role Positional

utf8 does role Positional, which provides the following routines:

(Positional) method of

method of()

Returns the type constraint for elements of the positional container. Defaults to Mu.

(Positional) method elems

method elems()

Should return the number of available elements in the instantiated object.

(Positional) method AT-POS

method AT-POS(\position)

Should return the value / container at the given position.

(Positional) method EXISTS-POS

method EXISTS-POS(\position)

Should return a Bool indicating whether the given position actually has a value.

(Positional) method STORE

method STORE(\values, :$initialize)

This method should only be supplied if you want to support the:

my @a is Foo = 1,2,3;

syntax for binding your implementation of the Positional role.

Should accept the values to (re-)initialize the object with. The optional named parameter will contain a True value when the method is called on the object for the first time. Should return the invocant.
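
To show how these methods fit together, here is a minimal sketch of a class that does Positional. The class `Countdown` and its attribute `from` are hypothetical and exist only for this illustration. It implements elems, AT-POS and EXISTS-POS, which is enough for read-only indexing.

```raku
# Hypothetical class, used only to illustrate the methods above.
class Countdown does Positional {
    has Int $.from = 3;

    # Number of available elements: from, from - 1, ..., 0
    method elems()               { $!from + 1 }

    # Value at the given position
    method AT-POS(\position)     { $!from - position }

    # Whether the given position holds a value
    method EXISTS-POS(\position) { 0 <= position < self.elems }
}

my $countdown = Countdown.new(from => 5);
say $countdown.elems;        # OUTPUT: «6␤»
say $countdown[0];           # OUTPUT: «5␤»
say $countdown[5];           # OUTPUT: «0␤»
say $countdown[9]:exists;    # OUTPUT: «False␤»
```

Supporting the `my @a is Countdown = ...` syntax would additionally require a STORE method, which this sketch leaves out.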