Update note: Brody Eller updated this tutorial for Swift 5.1. Ray Fix wrote the original.
By default, Swift is memory safe: It prevents direct access to memory and makes sure you’ve initialized everything before you use it. The key phrase is “by default.” You can also use unsafe Swift, which lets you access memory directly through pointers.
This tutorial will take you on a whirlwind tour of the so-called unsafe features of Swift.
Unsafe doesn’t mean dangerously bad code that might not work. Instead, it refers to code that needs extra care because it limits how the compiler can protect you from making mistakes.
These features are useful if you interoperate with an unsafe language such as C, need to gain additional runtime performance or simply want to explore the internals of Swift. In this tutorial, you’ll learn how to use pointers and interact with the memory system directly.
Note: While this is an advanced topic, you’ll be able to follow along if you have reasonable competency in Swift. If you need to brush up on your skills, please check out our iOS and Swift for Beginners series. C experience is beneficial but not necessary.
Getting Started
Download the begin project by clicking the Download Materials button at the top or bottom of the tutorial.
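This tutorial consists of three empty Swift playgrounds:
- UnsafeSwift, where you’ll explore memory layout and try out unsafe pointers.
- Compression, where you’ll wrap a C data compression API in a Swift interface.
- RandomNumbers, where you’ll create an alternative to arc4random to generate random numbers. It uses unsafe Swift, but hides that detail from users.
Start by opening the UnsafeSwift playground. Since all the code in this tutorial is platform-agnostic, you may select any platform.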
Exploring Memory Layout With Unsafe Swift
Unsafe Swift works directly with the memory system. You can visualize memory as a series of boxes — billions of boxes, actually — each containing a number.
Each box has a unique memory address. The smallest addressable unit of storage is a byte, which usually consists of eight bits.
Eight-bit bytes can store values from 0 to 255. Processors can also efficiently access words of memory, which are typically more than one byte.
On a 64-bit system, for example, a word is 8 bytes or 64 bits. To see this in action, you’ll use MemoryLayout to tell you the size and alignment of components of some native Swift types.
Add the following to your playground:
import Foundation

MemoryLayout<Int>.size          // returns 8 (on 64-bit)
MemoryLayout<Int>.alignment     // returns 8 (on 64-bit)
MemoryLayout<Int>.stride        // returns 8 (on 64-bit)

MemoryLayout<Int16>.size        // returns 2
MemoryLayout<Int16>.alignment   // returns 2
MemoryLayout<Int16>.stride      // returns 2

MemoryLayout<Bool>.size         // returns 1
MemoryLayout<Bool>.alignment    // returns 1
MemoryLayout<Bool>.stride       // returns 1

MemoryLayout<Float>.size        // returns 4
MemoryLayout<Float>.alignment   // returns 4
MemoryLayout<Float>.stride      // returns 4

MemoryLayout<Double>.size       // returns 8
MemoryLayout<Double>.alignment  // returns 8
MemoryLayout<Double>.stride     // returns 8
MemoryLayout<Type> is a generic type evaluated at compile time. It determines the size, alignment and stride of each specified Type and returns a number in bytes.
For example, an Int16 is two bytes in size and has an alignment of two as well. That means it has to start on even addresses, that is, addresses divisible by two.
For example, it’s legal to allocate an Int16 at address 100, but not at 101, because an odd number violates the required alignment.
When you pack a bunch of Int16s together, they pack at an interval of stride. For these basic types, the size is the same as the stride.
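To make the alignment and stride arithmetic concrete, here’s a minimal sketch. The isAligned helper and the base address of 100 are illustrative assumptions, not part of the playground; the snippet only does integer math and never touches memory.

```swift
// Hypothetical helper: true when `address` is a legal start for a value of type T.
func isAligned<T>(_ address: Int, for type: T.Type) -> Bool {
  return address % MemoryLayout<T>.alignment == 0
}

let aligned100 = isAligned(100, for: Int16.self)  // 100 is divisible by 2
let aligned101 = isAligned(101, for: Int16.self)  // 101 is not

// Packed Int16 values sit one stride apart, so element i starts at base + i * stride.
let base = 100
let thirdElementAddress = base + 2 * MemoryLayout<Int16>.stride

print(aligned100, aligned101, thirdElementAddress)  // prints "true false 104"
```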
Examining Struct Layouts
Next, look at the layout of some user-defined structs by adding the following to the playground:

struct EmptyStruct {}

MemoryLayout<EmptyStruct>.size       // returns 0
MemoryLayout<EmptyStruct>.alignment  // returns 1
MemoryLayout<EmptyStruct>.stride     // returns 1

struct SampleStruct {
  let number: UInt32
  let flag: Bool
}

MemoryLayout<SampleStruct>.size       // returns 5
MemoryLayout<SampleStruct>.alignment  // returns 4
MemoryLayout<SampleStruct>.stride     // returns 8
The empty structure has a size of zero. It can exist at any address since alignment is one and all numbers are evenly divisible by one.
The stride, curiously, is one. That’s because each EmptyStruct you create has to have a unique memory address, even though its size is zero.
For SampleStruct, the size is five but the stride is eight. That’s because its alignment requires it to be on 4-byte boundaries: rounding the size of five up to the next multiple of four gives eight, so the best Swift can do is pack at an interval of eight bytes.
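If you want to see where the five bytes of SampleStruct come from, a sketch along these lines may help. It repeats the SampleStruct definition so it stands alone (skip that part if the struct is already in your playground), and the roundedUp calculation is an illustration of the rounding rule rather than how the compiler is implemented.

```swift
struct SampleStruct {
  let number: UInt32
  let flag: Bool
}

// Byte offset of each stored property from the start of the struct.
let numberOffset = MemoryLayout<SampleStruct>.offset(of: \SampleStruct.number)
let flagOffset = MemoryLayout<SampleStruct>.offset(of: \SampleStruct.flag)

// For a non-empty type, stride is the size rounded up to a multiple of the alignment.
let size = MemoryLayout<SampleStruct>.size
let alignment = MemoryLayout<SampleStruct>.alignment
let roundedUp = (size + alignment - 1) / alignment * alignment

print(numberOffset as Any, flagOffset as Any, roundedUp)
```

The flag sits at offset 4, right after the four bytes of number, which gives the size of five; three bytes of padding then bring each array element up to the stride of eight.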
To see how the layout differs for class versus struct, add the following:

class EmptyClass {}

MemoryLayout<EmptyClass>.size       // returns 8 (on 64-bit)
MemoryLayout<EmptyClass>.stride     // returns 8 (on 64-bit)
MemoryLayout<EmptyClass>.alignment  // returns 8 (on 64-bit)

class SampleClass {
  let number: Int64 = 0
  let flag = false
}

MemoryLayout<SampleClass>.size       // returns 8 (on 64-bit)
MemoryLayout<SampleClass>.stride     // returns 8 (on 64-bit)
MemoryLayout<SampleClass>.alignment  // returns 8 (on 64-bit)
Classes are reference types, so MemoryLayout reports the size of a reference: eight bytes.
If you want to explore memory layout in greater detail, check out Mike Ash’s excellent talk, Exploring Swift Memory Layout.
Using Pointers in Unsafe Swift
A pointer encapsulates a memory address.
Types that involve direct memory access get an unsafe prefix, so the pointer type name is UnsafePointer.
The extra typing may seem annoying, but it reminds you that you’re accessing memory that the compiler isn’t checking. When done incorrectly, this could lead to undefined behavior, not just a predictable crash.
Swift doesn’t offer just a single UnsafePointer type that accesses memory in an unstructured way, like C’s char * does. Swift contains almost a dozen pointer types, each with different capabilities and purposes.
You always want to use the most appropriate pointer type for your purpose. This communicates intent better, is less error-prone and avoids undefined behavior.
Unsafe Swift pointers use a predictable naming scheme that describes the pointers’ traits: mutable or immutable, raw or typed, buffer style or not. In total, there are eight pointer combinations. You’ll learn more about them in the following sections.
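As a quick reference, the eight combinations can be listed as the standard library types they correspond to. This is only an illustrative sketch; the pointerTypes array is a made-up name, and Int stands in for any pointee type.

```swift
// Each name combines: Unsafe + [Mutable] + [Raw] + [Buffer] + Pointer.
let pointerTypes: [Any.Type] = [
  UnsafePointer<Int>.self,               // immutable, typed
  UnsafeMutablePointer<Int>.self,        // mutable, typed
  UnsafeBufferPointer<Int>.self,         // immutable, typed, buffer
  UnsafeMutableBufferPointer<Int>.self,  // mutable, typed, buffer
  UnsafeRawPointer.self,                 // immutable, raw
  UnsafeMutableRawPointer.self,          // mutable, raw
  UnsafeRawBufferPointer.self,           // immutable, raw, buffer
  UnsafeMutableRawBufferPointer.self     // mutable, raw, buffer
]

for pointerType in pointerTypes {
  print(pointerType)
}
```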
Using Raw Pointers
In this section, you’ll use unsafe Swift pointers to store and load two integers. Add the following code to your playground:
// 1
let count = 2
let stride = MemoryLayout<Int>.stride
let alignment = MemoryLayout<Int>.alignment
let byteCount = stride * count

// 2
do {
  print("Raw pointers")

  // 3
  let pointer = UnsafeMutableRawPointer.allocate(
    byteCount: byteCount,
    alignment: alignment)
  // 4
  defer {
    pointer.deallocate()
  }

  // 5
  pointer.storeBytes(of: 42, as: Int.self)
  pointer.advanced(by: stride).storeBytes(of: 6, as: Int.self)
  pointer.load(as: Int.self)
  pointer.advanced(by: stride).load(as: Int.self)

  // 6
  let bufferPointer = UnsafeRawBufferPointer(start: pointer, count: byteCount)
  for (index, byte) in bufferPointer.enumerated() {
    print("byte \(index): \(byte)")
  }
}
Here’s what’s going on:
1. These constants hold frequently used values:
   - count holds the number of integers to store.
   - stride holds the stride of type Int.
   - alignment holds the alignment of type Int.
   - byteCount holds the total number of bytes needed.
2. A do block adds a scope level, so you can reuse the variable names in upcoming examples.
3. UnsafeMutableRawPointer.allocate allocates the required bytes. This method returns an UnsafeMutableRawPointer. The name of that type tells you the pointer can load and store, or mutate, raw bytes.
4. A defer block makes sure you deallocate the pointer properly. ARC isn’t going to help you here, so you need to handle memory management yourself! You can read more about defer statements in the official Swift documentation.
5. storeBytes and load, unsurprisingly, store and load bytes. You calculate the memory address of the second integer by advancing the pointer stride bytes. Since pointers are Strideable, you can also use pointer arithmetic like: (pointer+stride).storeBytes(of: 6, as: Int.self).
6. An UnsafeRawBufferPointer lets you access memory as if it were a collection of bytes. This means you can iterate over the bytes and access them using subscripting. You can also use cool methods like filter, map and reduce. You initialize the buffer pointer using the raw pointer.
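The pointer arithmetic and collection-style access described above can be sketched as follows. The rawPointerDemo function is a hypothetical wrapper, used here only so the snippet is self-contained and its defer runs in a function scope.

```swift
func rawPointerDemo() -> (second: Int, byteSum: Int) {
  let stride = MemoryLayout<Int>.stride
  let byteCount = stride * 2
  let pointer = UnsafeMutableRawPointer.allocate(
    byteCount: byteCount, alignment: MemoryLayout<Int>.alignment)
  defer { pointer.deallocate() }

  pointer.storeBytes(of: 42, as: Int.self)

  // Pointer arithmetic: the same address as pointer.advanced(by: stride).
  let secondAddress = pointer + stride
  secondAddress.storeBytes(of: 6, as: Int.self)
  let second = secondAddress.load(as: Int.self)

  // The buffer pointer is a collection of UInt8, so reduce works on it.
  let bytes = UnsafeRawBufferPointer(start: pointer, count: byteCount)
  let byteSum = bytes.reduce(0) { $0 + Int($1) }
  return (second, byteSum)
}

let rawResult = rawPointerDemo()
print(rawResult.second, rawResult.byteSum)  // prints "6 48"
```

Only one byte of each small integer is non-zero, so the byte sum is 42 + 6 regardless of byte order.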
Even though UnsafeRawBufferPointer is unsafe, you can still make it safer by constraining it to specific types.
Using Typed Pointers
You can simplify the previous example by using typed pointers. Add the following code to your playground:
do {
  print("Typed pointers")

  let pointer = UnsafeMutablePointer<Int>.allocate(capacity: count)
  pointer.initialize(repeating: 0, count: count)
  defer {
    pointer.deinitialize(count: count)
    pointer.deallocate()
  }

  pointer.pointee = 42
  pointer.advanced(by: 1).pointee = 6
  pointer.pointee
  pointer.advanced(by: 1).pointee

  let bufferPointer = UnsafeBufferPointer(start: pointer, count: count)
  for (index, value) in bufferPointer.enumerated() {
    print("value \(index): \(value)")
  }
}
Notice the following differences:
- You allocate memory using UnsafeMutablePointer.allocate. The generic parameter lets Swift know you’re using the pointer to load and store values of type Int.
- You must initialize typed memory before use and deinitialize it after use. You do this using the initialize and deinitialize methods, respectively. Deinitialization is only required for non-trivial types. However, including deinitialization is a good way to future-proof your code in case you change to something non-trivial. It usually doesn’t cost anything since the compiler will optimize it out.
- Typed pointers have a pointee property that provides a type-safe way to load and store values.
- When advancing a typed pointer, you can simply state the number of values you want to advance. The pointer can calculate the correct stride based on the type of values it points to. Again, pointer arithmetic also works. You can also say (pointer+1).pointee = 6.
- The same holds true for typed buffer pointers: They iterate over values instead of bytes.
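Here’s a small sketch of those points side by side. The typedPointerDemo wrapper is hypothetical; it shows that the subscript and advanced(by:) both count in values rather than bytes, and that the typed buffer yields three Int values rather than 24 bytes.

```swift
func typedPointerDemo() -> (count: Int, sum: Int) {
  let count = 3
  let pointer = UnsafeMutablePointer<Int>.allocate(capacity: count)
  pointer.initialize(repeating: 0, count: count)
  defer {
    pointer.deinitialize(count: count)
    pointer.deallocate()
  }

  pointer.pointee = 42                  // element 0
  pointer[1] = 6                        // element 1, via subscript
  pointer.advanced(by: 2).pointee = 10  // element 2, via advanced(by:)

  // A typed buffer pointer iterates over Int values, not bytes.
  let values = UnsafeBufferPointer(start: pointer, count: count)
  return (values.count, values.reduce(0, +))
}

let typedResult = typedPointerDemo()
print(typedResult.count, typedResult.sum)  // prints "3 58"
```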
Next, you’ll learn how to go from unconstrained raw pointers to safer, type-constrained typed pointers.
Converting Raw Pointers to Typed Pointers
You don’t always need to initialize typed pointers directly. You can derive them from raw pointers as well.
Add the following code to your playground:
do {
  print("Converting raw pointers to typed pointers")

  let rawPointer = UnsafeMutableRawPointer.allocate(
    byteCount: byteCount,
    alignment: alignment)
  defer {
    rawPointer.deallocate()
  }

  let typedPointer = rawPointer.bindMemory(to: Int.self, capacity: count)
  typedPointer.initialize(repeating: 0, count: count)
  defer {
    typedPointer.deinitialize(count: count)
  }

  typedPointer.pointee = 42
  typedPointer.advanced(by: 1).pointee = 6
  typedPointer.pointee
  typedPointer.advanced(by: 1).pointee

  let bufferPointer = UnsafeBufferPointer(start: typedPointer, count: count)
  for (index, value) in bufferPointer.enumerated() {
    print("value \(index): \(value)")
  }
}
This example is similar to the previous one, except that it first creates a raw pointer. You create the typed pointer by binding the memory to the required type Int.
By binding memory, you can access it in a type-safe way. Memory binding goes on behind the scenes when you create a typed pointer.
The rest of this example is also the same as the previous one. Once you’re in typed pointer land, you can make use of pointee, for example.
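One way to convince yourself that the raw and typed pointers refer to the same bytes is to write through one and read through the other. This is a minimal sketch; bindThenRead is a hypothetical function name, and the values 42 and 6 simply mirror the example above.

```swift
func bindThenRead() -> (first: Int, second: Int) {
  let count = 2
  let rawPointer = UnsafeMutableRawPointer.allocate(
    byteCount: MemoryLayout<Int>.stride * count,
    alignment: MemoryLayout<Int>.alignment)
  defer { rawPointer.deallocate() }

  // Binding declares that this memory holds Int values.
  let typedPointer = rawPointer.bindMemory(to: Int.self, capacity: count)
  typedPointer.initialize(repeating: 0, count: count)
  defer { typedPointer.deinitialize(count: count) }

  typedPointer[0] = 42
  typedPointer[1] = 6

  // The raw pointer still addresses the same bytes, so raw loads see the typed stores.
  let first = rawPointer.load(as: Int.self)
  let second = rawPointer.load(
    fromByteOffset: MemoryLayout<Int>.stride, as: Int.self)
  return (first, second)
}

let boundResult = bindThenRead()
print(boundResult.first, boundResult.second)  // prints "42 6"
```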
Getting the Bytes of an Instance
Often, you have an existing instance of a type and you want to inspect the bytes that form it. You can achieve this using a function called withUnsafeBytes(of:).
To do so, add the following code to your playground:

do {
  print("Getting the bytes of an instance")

  var sampleStruct = SampleStruct(number: 25, flag: true)

  withUnsafeBytes(of: &sampleStruct) { bytes in
    for byte in bytes {
      print(byte)
    }
  }
}
This prints out the raw bytes of the SampleStruct instance.
withUnsafeBytes(of:) gives you access to an UnsafeRawBufferPointer that you can use inside the closure.
withUnsafeBytes is also available as an instance method on Array and Data.
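As a rough illustration of the Array and Data variants, the following sketch counts and sums bytes. The sample values are arbitrary, and the closure parameter types are spelled out only to make it explicit which overload is meant.

```swift
import Foundation

let values: [UInt16] = [1, 2, 3]

// The closure sees the array's contiguous storage as raw bytes.
let arrayByteCount = values.withUnsafeBytes { (bytes: UnsafeRawBufferPointer) -> Int in
  return bytes.count
}

let data = Data([10, 20, 30])

// Data hands the closure the same kind of UnsafeRawBufferPointer.
let dataByteSum = data.withUnsafeBytes { (bytes: UnsafeRawBufferPointer) -> Int in
  return bytes.reduce(0) { $0 + Int($1) }
}

print(arrayByteCount, dataByteSum)  // prints "6 60"
```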
Computing a Checksum
Using withUnsafeBytes(of:), you can return a result. For example, you might use this to compute a 32-bit checksum of the bytes in a structure.
Add the following code to your playground:

do {
  print("Checksum the bytes of a struct")

  var sampleStruct = SampleStruct(number: 25, flag: true)

  let checksum = withUnsafeBytes(of: &sampleStruct) { (bytes) -> UInt32 in
    return ~bytes.reduce(UInt32(0)) { $0 + numericCast($1) }
  }

  print("checksum", checksum) // prints checksum 4294967269
}
The reduce call adds the bytes, then ~ flips the bits. While not the most robust error detection, it shows the concept.
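The same idea can be wrapped in a generic function. This is a sketch under a couple of assumptions: checksum(of:) is a hypothetical helper that isn’t part of the tutorial’s project, and it uses &+ so that a very large input wraps around instead of trapping on overflow.

```swift
// Hypothetical helper: sums the bytes of any value, then flips the bits.
func checksum<T>(of value: T) -> UInt32 {
  return withUnsafeBytes(of: value) { bytes -> UInt32 in
    // &+ wraps on overflow instead of trapping.
    let sum = bytes.reduce(UInt32(0)) { $0 &+ UInt32($1) }
    return ~sum
  }
}

let byteChecksum = checksum(of: UInt8(26))
let wordChecksum = checksum(of: UInt32(25))
print(byteChecksum, wordChecksum)  // prints "4294967269 4294967270"
```

Be wary of using a helper like this on types with internal padding, since the contents of padding bytes aren’t something you can rely on.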
Now that you know how to use unsafe Swift, it’s time to learn some things you should absolutely not do with it.
Three Rules of the Unsafe Club
Be careful to avoid undefined behavior when writing unsafe code. Here are a few examples of bad code:
Don’t Return the Pointer From withUnsafeBytes!
// Rule #1
do {
  print("1. Don't return the pointer from withUnsafeBytes!")

  var sampleStruct = SampleStruct(number: 25, flag: true)

  let bytes = withUnsafeBytes(of: &sampleStruct) { bytes in
    return bytes // strange bugs here we come ☠️☠️☠️
  }

  print("Horse is out of the barn!", bytes) // undefined!!!
}
You should never let the pointer escape the withUnsafeBytes(of:) closure. Even if your code works today, it may cause strange bugs in the future.
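If you do need the bytes after the closure ends, one safe option is to copy them into an Array while the pointer is still valid. A minimal sketch, using a UInt32 stored in little-endian byte order so the byte values are predictable:

```swift
var value = UInt32(25).littleEndian

// Array(bytes) copies the bytes out, so nothing that escapes refers to the pointer.
let copiedBytes: [UInt8] = withUnsafeBytes(of: &value) { bytes in
  return Array(bytes)
}

print(copiedBytes)  // prints "[25, 0, 0, 0]"
```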
Only Bind to One Type at a Time!
// Rule #2
do {
  print("2. Only bind to one type at a time!")

  let count = 3
  let stride = MemoryLayout<Int16>.stride
  let alignment = MemoryLayout<Int16>.alignment
  let byteCount = count * stride

  let pointer = UnsafeMutableRawPointer.allocate(
    byteCount: byteCount,
    alignment: alignment)

  let typedPointer1 = pointer.bindMemory(to: UInt16.self, capacity: count)

  // Breakin' the Law... Breakin' the Law (Undefined behavior)
  let typedPointer2 = pointer.bindMemory(to: Bool.self, capacity: count * 2)

  // If you must, do it this way:
  typedPointer1.withMemoryRebound(to: Bool.self, capacity: count * 2) {
    (boolPointer: UnsafeMutablePointer<Bool>) in
    print(boolPointer.pointee) // See Rule #1, don't return the pointer
  }
}
Never bind memory to two unrelated types at once. This is called type punning, and Swift does not like puns. :]
Instead, temporarily rebind memory with a method like withMemoryRebound(to:capacity:).
Also, it is illegal to rebind from a trivial type, such as an Int, to a non-trivial type, such as a class. Don’t do it.
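For a well-behaved case of temporary rebinding, consider two trivial types with the same size and stride. The sketch below reads a UInt16 as an Int16; reinterpretAsSigned is a hypothetical helper written for this illustration.

```swift
func reinterpretAsSigned(_ value: UInt16) -> Int16 {
  let pointer = UnsafeMutablePointer<UInt16>.allocate(capacity: 1)
  pointer.initialize(to: value)
  defer {
    pointer.deinitialize(count: 1)
    pointer.deallocate()
  }

  // Inside the closure the memory is bound to Int16; afterward it's UInt16 again.
  return pointer.withMemoryRebound(to: Int16.self, capacity: 1) { signedPointer in
    return signedPointer.pointee
  }
}

let reinterpreted = reinterpretAsSigned(0xFFFF)
print(reinterpreted)  // prints "-1"
```

Only the value leaves the closure, not the rebound pointer, which keeps this in line with Rule #1.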
Don’t Walk Off the End… Whoops!
// Rule #3... wait
do {
  print("3. Don't walk off the end... whoops!")

  let count = 3
  let stride = MemoryLayout<Int16>.stride
  let alignment = MemoryLayout<Int16>.alignment
  let byteCount = count * stride

  let pointer = UnsafeMutableRawPointer.allocate(
    byteCount: byteCount,
    alignment: alignment)
  let bufferPointer = UnsafeRawBufferPointer(start: pointer, count: byteCount + 1) // OMG +1????

  for byte in bufferPointer {
    print(byte) // pawing through memory like an animal
  }
}
The ever-present problem of off-by-one errors becomes even worse with unsafe code. Be careful, review and test!
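One way to reduce the room for this kind of mistake is to allocate through a buffer pointer, so the count travels with the memory instead of being retyped by hand. This is only a sketch of that idea; sumBytesWithinBounds is a hypothetical name, and filling the memory with ones is just to have defined contents to read.

```swift
func sumBytesWithinBounds() -> (count: Int, sum: Int) {
  let byteCount = 3 * MemoryLayout<Int16>.stride

  // The buffer pointer remembers its own count, so no second length has to stay in sync.
  let buffer = UnsafeMutableRawBufferPointer.allocate(
    byteCount: byteCount, alignment: MemoryLayout<Int16>.alignment)
  defer { buffer.deallocate() }

  // Give every byte a defined value before reading it.
  _ = buffer.initializeMemory(as: UInt8.self, repeating: 1)

  var sum = 0
  for index in buffer.indices {  // indices come from the buffer, not from a manual count
    sum += Int(buffer[index])
  }
  return (buffer.count, sum)
}

let boundsResult = sumBytesWithinBounds()
print(boundsResult.count, boundsResult.sum)  // prints "6 6"
```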
Unsafe Swift Example 1: Compression
Time to take all your knowledge and use it to wrap a C API. Cocoa includes a C module that implements some common data compression algorithms. These include:
- LZ4 for when speed is critical.
- LZMA for when you need the highest compression ratio and don’t care about speed. The sample code names this case lz4a and maps it to COMPRESSION_LZMA.
- ZLIB, which balances space and speed.
- The new, open-source LZFSE, which does an even better job balancing space and speed.
Now, open the Compression playground in the begin project.
First, you’ll define a pure Swift API using Data by replacing the contents of your playground with the following code:

import Foundation
import Compression

enum CompressionAlgorithm {
  case lz4   // speed is critical
  case lz4a  // space is critical
  case zlib  // reasonable speed and space
  case lzfse // better speed and space
}

enum CompressionOperation {
  case compression, decompression
}

/// return compressed or uncompressed data depending on the operation
func perform(
  _ operation: CompressionOperation,
  on input: Data,
  using algorithm: CompressionAlgorithm,
  workingBufferSize: Int = 2000)
    -> Data? {
  return nil
}
The function that does the compression and decompression is perform, which is currently stubbed out to return nil. You’ll add some unsafe code to it shortly.
Next, add the following code to the end of the playground:

/// Compressed keeps the compressed data and the algorithm
/// together as one unit, so you never forget how the data was
/// compressed.
struct Compressed {
  let data: Data
  let algorithm: CompressionAlgorithm

  init(data: Data, algorithm: CompressionAlgorithm) {
    self.data = data
    self.algorithm = algorithm
  }

  /// Compresses the input with the specified algorithm. Returns nil if it fails.
  static func compress(
    input: Data,
    with algorithm: CompressionAlgorithm)
      -> Compressed? {
    guard let data = perform(.compression, on: input, using: algorithm) else {
      return nil
    }
    return Compressed(data: data, algorithm: algorithm)
  }

  /// Uncompressed data. Returns nil if the data cannot be decompressed.
  func decompressed() -> Data? {
    return perform(.decompression, on: data, using: algorithm)
  }
}
The Compressed structure stores both the compressed data and the algorithm used to create it. That makes it less error-prone when deciding what decompression algorithm to use.
Next, add the following code to the end of the playground:

/// For discoverability, adds a compressed method to Data
extension Data {
  /// Returns compressed data or nil if compression fails.
  func compressed(with algorithm: CompressionAlgorithm) -> Compressed? {
    return Compressed.compress(input: self, with: algorithm)
  }
}

// Example usage:
let input = Data(Array(repeating: UInt8(123), count: 10000))

let compressed = input.compressed(with: .lzfse)
compressed?.data.count // in most cases much less than original input count

let restoredInput = compressed?.decompressed()
input == restoredInput // true
The main entry point is an extension on the Data type. You’ve added a method called compressed(with:) which returns an optional Compressed struct. This method simply calls the static method compress(input:with:) on Compressed.
There’s an example at the end, but it’s currently not working. Time to fix that!
Scroll up to the first block of code you entered and begin the implementation of perform(_:on:using:workingBufferSize:) by inserting the following before return nil:

// set the algorithm
let streamAlgorithm: compression_algorithm
switch algorithm {
case .lz4:   streamAlgorithm = COMPRESSION_LZ4
case .lz4a:  streamAlgorithm = COMPRESSION_LZMA
case .zlib:  streamAlgorithm = COMPRESSION_ZLIB
case .lzfse: streamAlgorithm = COMPRESSION_LZFSE
}

// set the stream operation and flags
let streamOperation: compression_stream_operation
let flags: Int32
switch operation {
case .compression:
  streamOperation = COMPRESSION_STREAM_ENCODE
  flags = Int32(COMPRESSION_STREAM_FINALIZE.rawValue)
case .decompression:
  streamOperation = COMPRESSION_STREAM_DECODE
  flags = 0
}
This converts your Swift types to the C types required for the compression algorithm.
Next, replace return nil with:

// 1: create a stream
let streamPointer = UnsafeMutablePointer<compression_stream>.allocate(capacity: 1)
defer {
  streamPointer.deallocate()
}

// 2: initialize the stream
var stream = streamPointer.pointee
var status = compression_stream_init(&stream, streamOperation, streamAlgorithm)
guard status != COMPRESSION_STATUS_ERROR else {
  return nil
}
defer {
  compression_stream_destroy(&stream)
}

// 3: set up a destination buffer
let dstSize = workingBufferSize
let dstPointer = UnsafeMutablePointer<UInt8>.allocate(capacity: dstSize)
defer {
  dstPointer.deallocate()
}

return nil // To be continued
Here’s what’s happening:
1. Allocate a compression_stream and schedule it for deallocation with the defer block.
2. Then, using the pointee property, you get the stream and pass it to the compression_stream_init function. The compiler is doing something special here: It’s using the in-out & marker to take your compression_stream and turn it into an UnsafeMutablePointer<compression_stream>. Alternatively, you could have passed streamPointer. Then you wouldn’t need this special conversion.
3. Finally, you create a destination buffer to act as your working buffer.
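The alternative mentioned in step 2, passing streamPointer directly, might look roughly like this. It’s a sketch rather than the tutorial’s implementation: makeEncodeStream is a hypothetical function, the operation and algorithm are hard-coded for brevity, and the whole snippet is guarded so it only compiles where the Compression module is available.

```swift
#if canImport(Compression)
import Compression

// Hypothetical helper: returns true if a stream could be initialized.
func makeEncodeStream() -> Bool {
  let streamPointer = UnsafeMutablePointer<compression_stream>.allocate(capacity: 1)
  defer { streamPointer.deallocate() }

  // No & needed: streamPointer already is an UnsafeMutablePointer<compression_stream>.
  let status = compression_stream_init(
    streamPointer, COMPRESSION_STREAM_ENCODE, COMPRESSION_LZFSE)
  guard status != COMPRESSION_STATUS_ERROR else {
    return false
  }
  defer { compression_stream_destroy(streamPointer) }

  // From here on, fields would be reached through the pointer,
  // for example streamPointer.pointee.dst_size.
  return true
}
#endif
```

With this approach, the stream lives in the memory you allocated, so every later call would take streamPointer instead of &stream.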
Next, finish perform by replacing the final return nil with:

// process the input
return input.withUnsafeBytes { srcRawBufferPointer in
  // 1
  var output = Data()

  // 2
  let srcBufferPointer = srcRawBufferPointer.bindMemory(to: UInt8.self)
  guard let srcPointer = srcBufferPointer.baseAddress else {
    return nil
  }
  stream.src_ptr = srcPointer
  stream.src_size = input.count
  stream.dst_ptr = dstPointer
  stream.dst_size = dstSize

  // 3
  while status == COMPRESSION_STATUS_OK {
    // process the stream
    status = compression_stream_process(&stream, flags)

    // collect bytes from the stream and reset
    switch status {

    case COMPRESSION_STATUS_OK:
      // 4
      output.append(dstPointer, count: dstSize)
      stream.dst_ptr = dstPointer
      stream.dst_size = dstSize

    case COMPRESSION_STATUS_ERROR:
      return nil

    case COMPRESSION_STATUS_END:
      // 5
      output.append(dstPointer, count: stream.dst_ptr - dstPointer)

    default:
      fatalError()
    }
  }
  return output
}
This is where the work really happens. And here’s what it’s doing:
1. Create a Data object which will contain the output: the compressed or decompressed data, depending on what operation this is.
2. Set up the source and destination buffers with the pointers you allocated and their sizes.
3. Here, you keep calling compression_stream_process as long as it returns COMPRESSION_STATUS_OK.
4. You then copy the destination buffer into output that’s eventually returned from this function.
5. When the last packet comes in, marked with COMPRESSION_STATUS_END, you potentially only need to copy part of the destination buffer.
In this example, you can see that the 10,000-element array gets compressed down to 153 bytes. Not too shabby.
Unsafe Swift Example 2: Random Generator
Random numbers are important for many applications, from games to machine learning.
macOS provides arc4random, which produces cryptographically sound random numbers. Unfortunately, this call is not available on Linux. Moreover, arc4random only provides random values as UInt32. However, /dev/urandom provides an unlimited source of good random numbers.
In this section, you’ll use your new knowledge to read this file and create type-safe random numbers.
Start by creating a new playground, calling it RandomNumbers, or by opening the playground in the begin project.
Make sure to select the macOS platform this time.
Once ready, replace the default contents with:
import Foundation

enum RandomSource {
  static let file = fopen("/dev/urandom", "r")!
  static let queue = DispatchQueue(label: "random")

  static func get(count: Int) -> [Int8] {
    let data = UnsafeMutablePointer<Int8>.allocate(capacity: count)
    defer {
      data.deallocate()
    }
    // fread copies raw bytes and doesn't stop early at a newline byte
    let bytesRead = queue.sync {
      fread(data, 1, count, file)
    }
    precondition(bytesRead == count, "short read from /dev/urandom")
    return Array(UnsafeMutableBufferPointer(start: data, count: count))
  }
}

You declare the file variable static so only one will exist in the system. You’ll rely on the system closing it when the process exits.
Since multiple threads may want random numbers, you need to protect access to it with a serial GCD queue.
The get function is where the work happens.
First, allocate uninitialized storage for the number of bytes you need.
Next, read the data from the file with fread, making sure to do so while operating on the GCD queue. fread copies raw bytes and, unlike a line-oriented call such as fgets, doesn’t stop early when a random byte happens to be a newline, which would leave the rest of the buffer uninitialized.
Finally, copy the data to a standard array by first wrapping it in an UnsafeMutableBufferPointer that can act as a Sequence.
So far, this will only safely give you an array of Int8 values. Now you’re going to extend that.
Add the following to the end of your playground:

extension BinaryInteger {
  static var randomized: Self {
    let numbers = RandomSource.get(count: MemoryLayout<Self>.size)
    return numbers.withUnsafeBufferPointer { bufferPointer in
      return bufferPointer.baseAddress!.withMemoryRebound(
        to: Self.self,
        capacity: 1) {
        return $0.pointee
      }
    }
  }
}

Int8.randomized
UInt8.randomized
Int16.randomized
UInt16.randomized
Int32.randomized
UInt32.randomized
Int64.randomized
UInt64.randomized
This adds a static randomized property to all types that conform to the BinaryInteger protocol. For more on this, check out our tutorial on protocol-oriented programming.
First, you get the random numbers. With the bytes of the array that get returned, you then rebind the Int8 values as the type requested and return a copy.
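Rebinding isn’t the only way to turn bytes into an integer. As an alternative sketch, you could copy the bytes over the integer’s own storage, which sidesteps binding and alignment questions. The makeInteger helper is hypothetical, it constrains to FixedWidthInteger so the type’s size is known, and it takes UInt8 bytes rather than the Int8 values used above.

```swift
// Hypothetical helper: builds a fixed-width integer from exactly its size in bytes.
func makeInteger<T: FixedWidthInteger>(_ type: T.Type, from bytes: [UInt8]) -> T {
  precondition(bytes.count == MemoryLayout<T>.size,
               "byte count must match the type's size")
  var value: T = 0
  // Copy the bytes over the integer's own storage.
  withUnsafeMutableBytes(of: &value) { destination in
    destination.copyBytes(from: bytes)
  }
  return value
}

let assembled = makeInteger(UInt32.self, from: [1, 0, 0, 0])
// Interpret the stored bytes as little-endian to get a host-independent answer.
print(UInt32(littleEndian: assembled))  // prints "1"
```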
And that’s it! You’re now generating random numbers in a safe way, using unsafe Swift under the hood.
Where to Go From Here?
Congratulations on finishing this tutorial! You can download the completed project files using the Download Materials button at the top or bottom of this tutorial.
There are many additional resources you can explore to learn more about using unsafe Swift:
- Swift Evolution 0107: UnsafeRawPointer API gives a detailed overview of the Swift memory model and makes reading the API documents more understandable.
- Swift Evolution 0138: UnsafeRawBufferPointer API talks extensively about working with untyped memory and has links to open-source projects that benefit from using them.
- Imported C and Objective-C APIs will give you insights about how Swift interacts with C.
I hope you’ve enjoyed this tutorial. If you have questions or experiences you would like to share, feel free to share them in the forums!