That's Good Enough-www.andy-pearce.com

Time 2020-10-23 17:03:48

WebSite: http://www.andy-pearce.com

Description:

This article is part of a series which started with Uncovering Rust: References and Ownership. Rust is a fairly new multi-paradigm systems programming language that claims to offer both high performance and strong safety guarantees, particularly around concurrency and memory allocation. As I play with the language a little, I'm using this series of blog posts to discuss some of its more unique features as I come across them. This one discusses Rust's data types and powerful match operator.

There are a few features you expect from any mainstream imperative programming language. One of them is some support for basic builtin types, such as integers and floats. Another is some sort of structured data type, where you can assign values to named fields. Yet another is some sort of vector, array or list for sequences of values.

We're going to start this post by looking at how these standard features manifest in Rust. Some of this will be quite familiar to programmers from C++ and similar languages, but there are a few surprises along the way and my main aim is to discuss those.

Scalar Types

Rust has builtin scalar types for integers, floats, booleans and characters.

Due to Rust's low-level nature, you generally have to be explicit about the sizes of these. There are integral types for 8-, 16-, 32-, 64- and 128-bit values, both signed and unsigned. For example, i32 is a signed 32-bit integer and u128 is an unsigned 128-bit integer. There are also architecture-dependent types isize and usize which use the native word size of the machine. These are typically used for array offsets. Floats can be f32 for single-precision and f64 for double.

One point that's worth noting here is that Rust is a strongly typed language and won't generally perform implicit casts for you, even for numeric types. For example, you can't assign or compare integers with floats, or even integers of different sizes, without doing an explicit conversion.
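
As a minimal sketch of what such explicit conversions can look like, the hypothetical helper functions below (their names are my own, chosen only for illustration) use the standard From trait for lossless widening and the as keyword for a narrowing cast:

```rust
// Hypothetical helpers, named only for this illustration.

/// Widen an i32 to an i64 losslessly using the standard `From` trait.
fn widen(x: i32) -> i64 {
    i64::from(x)
}

/// Compare an integer with a float by converting explicitly first.
/// Writing `i < f` directly would not compile.
fn int_less_than_float(i: i32, f: f64) -> bool {
    f64::from(i) < f
}

/// Narrow with `as`, which silently keeps only the low 8 bits.
fn truncate(x: u32) -> u8 {
    x as u8
}

fn main() {
    println!("{}", widen(7));
    println!("{}", int_less_than_float(2, 2.5));
    println!("{}", truncate(300));
}
```

The distinction is worth noticing: From is only implemented where the conversion cannot lose information, whereas as will happily truncate.
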
Requiring explicit conversions keeps costs visible, but it does mean programmers need to consider their types carefully; that's no bad thing in my humble opinion.

Specifically on the topic of integers it's also worth noting that Rust will panic (terminate the execution) if you overflow your integer size, but only in a debug build. If you compile a release build, the overflow is instead allowed to wrap around. However, the clear intention is that programmers shouldn't be relying on such tricks to write safe and portable code.

Values of type bool can be true or false. Even Rust hasn't managed to introduce anything surprising or unconventional about booleans! One point of interest is that the expression in an if statement has to be a bool. Once again there are no implicit conversions, and there is no assumption of equivalence between, say, false and 0 as there is in C++.

The final type, char, has a slight surprise waiting for us, which is that it has a size of four bytes and can represent any Unicode code point. It's great to see Unicode support front and centre in the language like this, hopefully making it very difficult for people who want to assume that the world is ASCII. Those of you familiar with Unicode may also know that the concept of what constitutes a character may surprise those who are used to working only with ASCII, so there could be puzzled programmers out there at times. But we live in a globalised world now and there's no longer any excuse for any self-respecting programmer to write ASCII-first code.

Arrays

Rust arrays are homogeneous (each array contains values of only one type) and are of a fixed size, which must be known at compile time. An array declared as a local variable is stored on the stack. Rust does provide a more dynamic Vec type which uses the heap and allows resizing, but I'm not going to discuss that here.

In the interests of safety, Rust requires that every element of an array be initialised when constructed. Because of this, it's usually not required to specify a type, but of course there is a syntax for doing so.
It's also possible to initialise every item to the same value using a shorthand. These are all illustrated in the example below.

    // These two are equivalent, due to type inference.
    let numbers1 = [9, 9, 9, 9, 9];
    let numbers2: [i32; 5] = [9, 9, 9, 9, 9];
    let numbers3 = [9; 5];  // Repeated value shorthand.

Although the size of the array must be known at compile time, of course the compiler can't police every access to the array. For example, you may access an item based on user input. Rust does do bounds-checking at runtime, however. Discussion of how to handle runtime errors like this is a topic for another time, but the default action will be to terminate the executable immediately.

Structures and Tuples

The basic mechanics of structs in Rust work quite analogously to those in C++, aside from some minor syntactic differences. Here's a definition to illustrate:

    struct Contact {
        first_name: String,
        last_name: String,
        email: String,
        age: u8,
        business: bool,
    }

To create an instance of a struct the syntax is similar, except providing values instead of types after the colons. After creation, the dot notation to read and assign struct fields will also be familiar to both C++ and Python programmers:

    fn main() {
        let mut contact1 = Contact {
            first_name: String::from("John"),
            last_name: String::from("Doe"),
            email: String::from("jdoe@example.com"),
            age: 21,
            business: false,
        };
        println!("Contact name is {} {}",
                 contact1.first_name, contact1.last_name);
        contact1.first_name = String::from("Jane");
        println!("Contact name is {} {}",
                 contact1.first_name, contact1.last_name);
    }

Note that to assign to first_name we had to make contact1 mutable and that this mutability applies to the entire structure, not to each field. No surprises for C++ programmers there either.

Now there are a couple more unique features that are worth mentioning. The first of them comes when creating constructor methods.
Let's say we want to avoid having to set the business field, so we wrap it up in a function:

    fn new_business_contact(first_name: String,
                            last_name: String,
                            email: String,
                            age: u8) -> Contact {
        Contact {
            first_name: first_name,
            last_name: last_name,
            email: email,
            age: age,
            business: true
        }
    }

However, it's a bit tedious repeating all those field names in the body. Well, if the function parameters happen to match the field names you can use a shorthand for this:

    fn new_business_contact(first_name: String,
                            last_name: String,
                            email: String,
                            age: u8) -> Contact {
        Contact {
            first_name,
            last_name,
            email,
            age,
            business: true
        }
    }

Another convenient syntactic trick is the struct update syntax, which can be used to create a copy of another struct with some changes:

    let contact1 = Contact {
        // ...
    };
    let contact2 = Contact {
        first_name: String::from("John"),
        last_name: String::from("Smith"),
        ..contact1
    };

This will duplicate all fields not explicitly changed. There can be a sting in this particular tail, though, due to the ownership rules. In this example, the String value from contact1.email will be moved into contact2.email and so the first instance will no longer be valid after this point.

Finally in this section I'll briefly talk about tuples. I'm talking about them here rather than along with other compound types because I feel they work in a very similar way to structs, just without the field names. They have a fixed size defined when they are created and this cannot change, as with an array. Unlike an array, however, they are heterogeneous: they can contain multiple different types.

One thing that might surprise Python programmers in particular, however, is that the elements of a tuple are accessed using dot notation in the same way as a struct. In a way you can think of it as a struct where the names of the fields are just automatically chosen as zero-based integers.

    fn main() {
        let tup = (123, 4.56, "hello");
        println!("{} {} {}", tup.0, tup.1, tup.2);
        // Can also include explicit types for the tuple fields.
        // Note that a string literal has type &str, not String.
        let tup_copy: (u32, f64, &str) = tup;
    }

If you want to share the definition of a tuple around in the same way as for a struct but you don't want to give the fields names, you can use a tuple struct to do that:

    struct Colour(u8, u8, u8);

    fn main() {
        let purple = Colour(255, 0, 255);
        println!("R={} G={}, B={}", purple.0, purple.1, purple.2);
    }

In all honesty I'm not entirely sure how useful that'll be, but time will tell.

The final note here is that structs can also hold references, although none of the examples here utilised that. However, doing so means exercising a little more care because the original value can't go out of scope any time before any structs with references to it. This is a topic for a future discussion on lifetimes.

Enumerations

Continuing the theme of data types that C++ offers, Rust also has enumerations, hereafter referred to as enums. Beyond the name the similarity gets very loose, however. In C++ enums are essentially a way to add textual aliases to integral values; there's a bit of syntactic sugar to treat them as regular values, but you don't have to dip your toes too far under the water to get them bitten by an integer.

In Rust, however, they have features that are more like a union in C++, although unlike a union they don't rely on the programmer to know which variant is in use at any given time.

You can use them very much like a regular enum. The values defined within the enum are scoped within the namespace of the enumeration name [1].

    enum ContactType {
        Personal,
        Colleague,
        Vendor,
        Customer,
    }

    let contact1_type = ContactType::Personal;
    let contact2_type = ContactType::Vendor;

However, much more powerfully than this, these variants can also have data values associated with them, and each variant can be associated with its own data type.

    // We reference contacts by their email address except for
    // colleagues, where we use employee number; and vendors,
    // where we use supplier ID, which consists of three numbers.
    enum ContactType {
        Personal(String),
        Colleague(u64),
        Vendor(u32, u32, u32),
        Customer(String)
    }

    // The String variants need an owned String, not a bare literal.
    let customer = ContactType::Customer(String::from("andy@example.com"));
    let colleague = ContactType::Colleague(229382);
    let supplier = ContactType::Vendor(23, 223, 4);

This construct is great for implementing the sort of code where you need to branch differently based on the underlying type of something. I can just hear the voices of all the object-orientation purists declaring that polymorphism is the correct solution to this problem: that everything should be exposed as an abstract method in the base class that all the derived classes implement. I wouldn't say I disagree necessarily, but I would also say that this isn't a clean fit in every case and polymorphism isn't the one-size-fits-all solution it has on occasion been presented as.

Rust implements some types of polymorphism, and features such as traits are a useful alternative to inheritance for code reuse, as we'll see in a later post. But since Rust doesn't implement true inheritance, more properly called subtype polymorphism, I suspect this flexibility of enumerations is more important in Rust than it would be in C++.

A little further down we'll see how to use the match operator to do this sort of switching in an elegant way, but first we'll see one example of a pre-defined enum in Rust that's particularly widely used.

Option

It's a very common case that a function needs to return a value in the happy case or raise some sort of error in the less happy case. Different languages have different mechanisms for this, one of the more common in modern languages being to raise exceptions.
This is particularly common in Python, where exceptions are used for a large proportion of the functionality, but it's also quite normal in C++ where the function of the destructors and the stack unwinding process are both heavily oriented around making this a fairly safe process.

Despite its extensive support for exceptions, however, C++ is still a bit of a hybrid and it has a number of cases where its APIs still use the other primary method of returning errors, via the return value. A good example of this is the std::string::find() method which searches for a substring within the parent string. This clearly has two different classes of result: either the string is found, in which case the offset within the parent string is returned; or the string is not found, in which case the method returns the magic std::string::npos value. In other cases functions can return either a pointer for the happy case or a NULL in case of error.

Rust does not support exceptions. This is for a number of reasons, partly related to the overhead of raising exceptions and also the fact that return values make it easier for the compiler to force the programmer to handle all error cases that a function can return.

This is where the Option enum comes in useful for implementing these error returns in Rust. It's defined something like this:

    enum Option<T> {
        Some(T),
        None,
    }

This enum is capable of storing some type T, which is a generic type parameter much like a C++ template type (generics will be discussed properly in a later post), or the single value None. This allows a function to return any value type it wishes, but also leave open the possibility of returning None for an error.

That's about all there is to say about Option, and we'll see the idiomatic way to use it in the next section.

Matching

The final thing I'm going to talk about is the match flow control operator.
This is conceptually similar to the switch statement in C++, but it's got rather more cleverness up its sleeves.

The first thing to note about match is that unlike switch in C++ it is an expression instead of a statement. One aspect of Rust I haven't talked about yet is that expressions may contain statements, however, so this isn't a major obstacle. But it does mean that it's fairly easy to use simple match expressions in assignments or as return values:

    enum Direction {
        North,
        South,
        East,
        West,
    }

    fn get_bearing(d: Direction) -> u16 {
        match d {
            Direction::North => 0,
            Direction::East => 90,
            Direction::South => 180,
            Direction::West => 270,
        }
    }

The match expression has multiple arms, each of which has a pattern and a result expression. To do more than just return a value from the expression, we can wrap it in braces:

    fn get_bearing(d: Direction) -> u16 {
        match d {
            Direction::North => 0,
            Direction::East => {
                println!("East is East");
                90
            }
            Direction::South => {
                println!("Due South");
                180
            }
            Direction::West => {
                println!("Go West");
                270
            }
        }
    }

We can use the patterns to do more than just match specific values, though. Taking the Option type from earlier, we can use it to extract the return values from functions whilst still ensuring we handle all the error cases.

For example, the String::find() method searches for a substring and returns an Option<usize> which is None if the value wasn't found or the offset within the string if it was found. We can use this to, say, extract the domain part from an email address:

    fn get_domain(email: &String) -> &str {
        match email.find("@") {
            None => "",
            Some(x) => &email[x+1..],
        }
    }

This function takes a String reference and returns a string slice representing the domain part of the email, unless the email address doesn't contain an @ character in which case we return an empty string.
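
To see both arms in action, here is a small self-contained sketch which repeats the function and calls it once with an @ and once without. The sample addresses are made up purely for illustration.

```rust
// Same function as in the text, repeated so this sketch stands alone.
fn get_domain(email: &String) -> &str {
    match email.find("@") {
        // No '@' present: fall back to an empty string.
        None => "",
        // '@' found at byte offset x: the domain is everything after it.
        Some(x) => &email[x + 1..],
    }
}

fn main() {
    // Made-up sample inputs.
    let with_at = String::from("jdoe@example.com");
    let without_at = String::from("not-an-email");
    println!("[{}]", get_domain(&with_at));
    println!("[{}]", get_domain(&without_at));
}
```
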
I'm not going to say that the semantics of an empty string are ideal in this case, but it's just an example.

As another example we could write a function to display the contact details for the ContactType defined earlier:

    enum ContactType {
        Personal(String),
        Colleague(u64),
        Vendor(u32, u32, u32),
        Customer(String)
    }

    fn show_contact(contact: ContactType) {
        match contact {
            ContactType::Personal(email) => {
                println!("Personal: {}", email);
            }
            ContactType::Colleague(employee_number) => {
                println!("Colleague: {}", employee_number);
            }
            ContactType::Vendor(id1, id2, id3) => {
                println!("Vendor: {}-{}-{}", id1, id2, id3);
            }
            ContactType::Customer(email) => {
                println!("Customer: {}", email);
            }
        }
    }

One aspect of match expressions that isn't immediately obvious is that they are required to be exhaustive. So, if you don't handle every enum value, for example, then you'll get a compile error. This is what makes things like the Option example particularly safe as it forces handling of all errors, which is generally regarded as good practice if you're writing robust code. This also makes perfect sense if you consider that match is an expression: if you assign the result to a variable, say, then the compiler needs something to assign, and if you hit a case that your match doesn't handle then what's the compiler going to do?

Of course if we're using match for something other than an enum then handling every value would be pretty tedious. For these cases we can use the pattern _ as the default match. The example below also shows how we can match multiple patterns using | as a separator:

    fn is_perfect(n: u32) -> bool {
        match n {
            6 | 28 | 496 | 8128 | 33_550_336 => true,
            _ => false
        }
    }

Here we're meeting the needs of match by covering every single case.
If we removed that final default arm, the compiler wouldn't let us get away with it:

    error[E0004]: non-exhaustive patterns: `0u32..=5u32`, `7u32..=27u32`, `29u32..=495u32` and 3 more not covered
      --> src/main.rs:10:11
       |
    10 |     match n {
       |           ^ patterns `0u32..=5u32`, `7u32..=27u32`, `29u32..=495u32` and 3 more not covered
       |
       = help: ensure that all possible cases are being handled, possibly by adding wildcards or more match arms

But what if we really wanted to only handle a single case? It would be pretty dull if we had to have a default arm in a match, then check for that value being returned and ignore it.

Let's take the get_domain() example from earlier. Let's say that if you find a domain, you want to use it; but if not, you have some more complicated logic to invoke to infer the domain by looking at the username. You could handle that by doing something like this:

    fn get_domain(email: &String) -> &str {
        let ret = match email.find("@") {
            None => "",
            Some(x) => &email[x+1..],
        };
        if ret != "" {
            ret
        } else {
            // More complex logic goes here...
            ""  // Placeholder so the sketch compiles.
        }
    }

But that's a little clunky. Rust has a special syntax called if let for handling just a single case like this:

    fn get_domain(email: &String) -> &str {
        if let Some(x) = email.find("@") {
            &email[x+1..]
        } else {
            // More complex logic goes here...
            ""  // Placeholder so the sketch compiles.
        }
    }

I only recently came across this syntax and my opinions are honestly a little mixed. Whilst I find the match expressions comprehensible and intuitive, this odd combination of if and let just seems unusual to me. Mind you, I suspect it's a common enough case to be useful.

So that's a whirlwind tour of match and Rust's pattern-matching. It's important to note that this is a much more powerful feature than I've managed to express here as we've only really discussed matching by literals and by enum type.
In general patterns can be used in fairly creative ways to extract fields from values at the same time as matching literals, and they can even have conditional expressions added, which Rust calls match guards. These are illustrated in the (rather contrived!) example below:

    struct Colour {
        red: u8,
        green: u8,
        blue: u8
    }

    fn classify_colour(c: Colour) {
        match c {
            Colour {red: 0, green: 0, blue: 0} => {
                println!("Black");
            }
            Colour {red: 255, green: 255, blue: 255} => {
                println!("White");
            }
            Colour {red: r, green: 0, blue: 0} => {
                println!("Red {}", r);
            }
            Colour {red: 0, green: g, blue: 0} => {
                println!("Green {}", g);
            }
            Colour {red: 0, green: 0, blue: b} => {
                println!("Blue {}", b);
            }
            Colour {red: r, green: g, blue: 0} => {
                println!("Brown {} {}", r, g);
            }
            Colour {red: r, green: 0, blue: b} => {
                println!("Purple {} {}", r, b);
            }
            Colour {red: r, green: g, blue: b} if r == b && r == g => {
                println!("Grey {}", r);
            }
            Colour {red: r, green: g, blue: b} => {
                println!("Mixed colour {}, {}, {}", r, g, b);
            }
        }
    }

Hopefully most things there are fairly self-explanatory and in any case it's just intended as an illustration of the sorts of facilities that are available. It's also worth mentioning that the compiler does give you some help to detect if you're masking patterns with earlier ones, but it doesn't appear to be perfect. For example, if I moved the first two arms to the end of the list, they were both correctly flagged as unreachable. However, if I moved the pattern for white after the pattern for grey it didn't generate a warning; I'm guessing the job of determining reachability around match guards is just too difficult to do reliably.

Conclusions

Rust's type system certainly offers some powerful flexibility, and the pattern matching looks like a fantastic feature for pulling apart structures and matching special cases within them.
The specific Option enum also looks like quite a pleasant way to implement the "value or error" case given that Rust doesn't offer exceptions for this purpose.

My main reservation around these features is that there's an awful lot of syntax building up here, and it's a fine line between a good amount of expressive power and edging into Perl's "there's too many ways to do it" philosophy. The if let syntax in particular seems possibly excessive to me. But I'm certainly reserving judgement on that for now until I've had some more experience with the language.

[1] For anyone familiar with C++11, this is what you get when you declare a C++ enum with enum class MyEnum { … }.

Rust is a fairly new multi-paradigm systems programming language that claims to offer both high performance and strong safety guarantees, particularly around concurrency and memory allocation. As I play with the language a little, I'm using this series of blog posts to discuss some of its more unique features as I come across them. This one talks about Rust's ownership model.

Over the last few years I've become more aware of the Rust programming language. Slightly more than a decade old, it has consistently topped the Stack Overflow Developer Survey in the most loved language category for the last four years, so there's clearly a decent core of very keen developers using it. It aims to offer performance on a par with C++ whilst considerably improving on the safety of the language, so as a long-time C++ programmer who's all too aware of its potential for painfully opaque bugs, I thought it was definitely worth checking what Rust brings to the table.

As the first article in what I hope will become a reasonable series, I should briefly point out what these articles are not.
They are certainly not meant to be a detailed discussion of Rust's history or design principles, nor a tutorial. The official documentation and other sources already do a great job of those things.

Instead, this series is a hopefully interesting tour of some of the aspects of the language that set it apart, enough to get a flavour of it and perhaps decide if you're interested in looking further yourself. I'm specifically going to be comparing the language to C++ and perhaps occasionally Python as the two languages with which I'm currently most familiar.

Mutability

Before I get going on the topic of this post, I feel it's important to clarify one perhaps surprising detail of Rust to help understand the code examples below, and it is this: all variables are immutable by default. It's possible to declare any variable mutable by prefixing with the mut keyword.

I could imagine some people considering this a minor syntactic issue as it just means what would be const in C++ is non-mut in Rust, and non-const in C++ is mut in Rust. So why mention it? Well, mostly to help people understand the code examples a little more easily; whilst it's debatably not a fundamental issue, it's also not something that's necessarily self-evident from the syntax either.

Also, I think it's a nice little preview of the way the language pushes you towards one of its primary goals: safety. If you forget the modifier, things default to the most restrictive situation, and the compiler will prod you to add the modifier explicitly if that's what you want. But if it isn't what you want, you get the hint to fix a potential bug.
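
A minimal sketch of that default, using a hypothetical count_up_to() function of my own invention: the counter has to be declared with mut because it is updated in the loop, whereas the limit is never reassigned and so needs no modifier.

```rust
// Hypothetical function, used only to illustrate `mut`.
fn count_up_to(limit: u32) -> u32 {
    // `count` is reassigned below, so it must be declared `mut`.
    let mut count = 0;
    while count < limit {
        count += 1;
    }
    count
}

fn main() {
    // Immutable by default: no modifier needed.
    let limit = 3;
    // Uncommenting the next line would be rejected by the compiler,
    // because `limit` was not declared `mut`.
    // limit = 4;
    println!("Counted to {}", count_up_to(limit));
}
```
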
Immutable values typically also make it much easier to take advantage of concurrency safely, but that's a topic for a future post.

Ownership

Since one of the touted features of the language is safety around memory allocation, I'm going to start off outlining how ownership works in Rust.

Ownership is a concept that's stressed many times during the Rust documentation, although in my view it's pretty fundamental to truly understanding any language. Manipulating variables in memory is the bulk of what software does most of the time and errors around ownership are some of the most common sources of bugs across multiple languages.

In general, owning a value in this context means that a piece of code has a responsibility to manage the memory associated with that value. This isn't about mutability or any other concept people might feasibly regard as forms of ownership.

Just to be clear, I'm going to skip discussion of stack-allocated variables here. Management of data on the stack is generally similar in all mainstream imperative languages and generally falls out of the language scoping rules quite neatly, so I'm going to focus this discussion on the more interesting and variable topic of managing heap allocations.

In C++ ownership is a nebulous concept and left for the programmer to define. The language provides the facility to allocate memory and it's up to the programmer to decide when it's safe to free it. Techniques such as RAII allow a heap allocation to be tied to a particular scope, either on the stack or linked with an owning class, but this must be manually implemented by the programmer. It's quite easy to neglect this in some case or other, and since it's entirely optional the compiler isn't going to help you police yourself. As a result, memory mismanagement is a very common class of bugs in C++ code.

Higher-level languages tend to utilise different forms of garbage collection to avoid exposing the programmer to these issues.
Python's reference counting is a simple concept and covers most cases gracefully, although it adds performance overhead to many operations in the language and cyclic references complicate matters such that additional garbage collection algorithms are still required. Languages like Java with tracing garbage collectors impose less performance penalty on access than reference counting, but may be prone to spikes of sudden load when a garbage sweep is done. These systems are also often more complex to implement, especially as in the real world they're often a hybrid of multiple techniques. This isn't necessarily a direct concern for the programmer, as someone else has done all the hard work of implementing the algorithm, but it does inch up the risk of hitting unpredictable pathological performance behaviour. These can be the sort of intermittent bugs that we all love to hate to investigate.

All this said, Rust takes a simpler approach, which I suppose you could think of as what's left of reference counting after a particularly aggressive assault from Ockham's Razor.

Rust enforces three simple rules of ownership:

* Each value has a variable which is the owner.
* Each value has exactly one owner at a time.
* When the owner goes out of scope the value is dropped [1].

I'm not going to go into detail on the scoping rules of Rust right now, although there are some interesting details that I'll probably cover in another post. For now suffice to say that Rust is lexically scoped in a very similar way to C++, where variables are in scope from their definition until the end of the block in which they're defined [2].

This means, therefore, that because a value has only a single owner, and because the scope of that owner is well-defined and must always exit at some point, there is no possible way for the value to not be dropped and its memory leaked.
Hence the promised memory safety is achieved with some very simple rules that can be validated at compile time.

So there you go, you assign a variable and the value will be valid until such point as that variable goes out of scope. What could be simpler?

    // Start of block.
    {
        // String value springs into existence.
        let my_value = String::from("hello, world");
        println!("Value: {}", my_value);
    }
    // End of block, my_value out of scope, value dropped.

Moving right along

Well of course it's not quite that simple. For example, what happens if we assign the value to another variable? I mean, that's a pretty simple case. How hard can it be to figure out what this code will print?

    fn main() {
        let my_value = String::from("hello, world");
        let another_value = my_value;
        println!("Values: {} {}", my_value, another_value);
    }

The answer is: slightly harder than you might imagine. In fact the code above won't even compile:

       Compiling sample v0.1.0 (/Users/apearce16/src/local/rust-tutorial/sample)
    error[E0382]: borrow of moved value: `my_value`
     --> src/main.rs:4:31
      |
    2 |     let my_value = String::from("hello, world");
      |         -------- move occurs because `my_value` has type `std::string::String`, which does not implement the `Copy` trait
    3 |     let another_value = my_value;
      |                         -------- value moved here
    4 |     println!("Values: {} {}", my_value, another_value);
      |                               ^^^^^^^^ value borrowed here after move

    error: aborting due to previous error

This is because Rust implements move semantics by default on assignment. So what's really happening in the code above is that a string value is created and ownership is assigned to the my_value variable. Then this is assigned to another_value, which results in ownership being transferred to the another_value variable.
At this point the my_value variable is still in scope, but it's no longer valid.

The compiler is pretty comprehensive in explaining what's going on here: the value is moved on line 3 and then the invalidated my_value is referenced on line 4, which is what triggers the error.

This may seem unintuitive to some people, but before making any judgements you should consider the alternatives. Firstly, Rust could abandon its simple ownership rules and allow arbitrary aliasing like in C++. Except that would mean either exposing manual memory management or replacing it with a more expensive garbage collector, which would compromise the goals of safety and performance respectively.

Secondly, Rust could perform a deep copy of the data on the assignment, so duplicating the value and ending up with two variables each with its own copy. This is workable, but defeats the goal of performance as memory copying is pretty slow if you end up doing an awful lot of it. It also violates a basic programmer expectation that a simple action like assignment should not be expensive.

And so we're left with the move semantics defined above. It's worth noting, however, that this doesn't apply to all types. Some are defined as being safe to copy: generally the simple scalar types such as integers, floats, booleans, and so on. The key property of these which makes them safe is that they're stored entirely on the stack; there's no associated heap allocation to handle.

It's also possible to declare that new types are safe to copy by adding the Copy trait, but traits are definitely a topic for a later post.

It's also worth noting that these move semantics are not as restrictive as they might seem due to the existence of references, which I'll talk about later in this post. First, though, it's interesting to look at how these semantics work with functions.

Ownership in and out of functions

The ownership rules within a scope are now clear, but what about passing values into functions?
In C++, for example, arguments are passed by value which means that the function essentially operates on a copy. If this value happens to be a pointer or reference then of course the original value may be modified, but as mentioned above we're deferring discussion of references in Rust for a moment.

Argument passing would appear to suffer the same issues as the assignment example above, in that we don't want to perform a deep copy, but neither do we want to complicate the ownership rules. So it's probably little surprise that argument passing into functions also passes ownership in the same way as the assignment.

This code snippet will fail to compile:

    fn main() {
        let s = String::from("hello");
        my_function(s);
        // Oops, s isn't valid here any more!
        println!("Value of s: {}", s);
    }

    fn my_function(arg: String) {
        // Ownership passes to the arg parameter.
        println!("Now I own {}", arg);
        // Here arg goes out of scope and the String is dropped.
    }

Although this may seem superficially surprising, when you really think about it argument passing is just a fancy form of assignment into a form of nested scope, so it shouldn't be a surprise that it follows the same semantics.

The same logic applies to function return values, and this is where things could get slightly surprising for C++ programmers, who are used to returning pointers or references to stack values being a tremendous source of bugs, and to returning non-referential values being a cause of potentially expensive copy operations.

In C++ when the function call ends, any pointer or reference to anything on its stack that is passed to the caller will now be invalid. These can be some pretty nasty bugs, particularly for less experienced programmers. It doesn't help that the compiler doesn't stop you doing this, and also that these situations often give the appearance of working correctly initially, since the stack frame of the function has often not been reused yet so the pointer still seems to point to valid data immediately after the call returns.
This clearly harms the safety of the code.

If the programmer decides to resolve this issue by returning a complex class directly by value instead of by pointer or reference, then this generally entails default construction of an instance in the caller, then execution of the function and then assignment of the returned value to the instance in the caller, which might involve some expensive copying. This potentially harms the performance of the code.

I'm deliberately glossing over some subtleties here around returning temporary objects, return value optimisation and move semantics in C++, which are all well outside the scope of this post on Rust. But even though solutions to these issues exist, they require significant knowledge and experience on the part of the programmer to take advantage of correctly, particularly for user-defined classes.

In Rust things are simpler: you can return a local value and ownership passes to the caller in the obvious manner.

    fn main() {
        let my_value = create();
        // At this point "my_value" owns a String.
        println!("Now I own {}", my_value);
        let another_value = transform(my_value);
        // At this point "another_value" owns a String,
        // but "my_value" is now invalid.
        println!("Now I own {}", another_value);
    }

    fn create() -> String {
        let new_str = String::from("hello, world");
        // Ownership will pass to the caller.
        new_str
    }

    fn transform(mut arg: String) -> String {
        // We've declared the argument mutable, which is OK
        // since ownership has passed to us. We append some
        // text to it and then return it, whereupon ownership
        // passes back to the caller.
        arg.push_str("!!!");
        arg
    }

For anyone puzzled by the bare expressions at the end of create() and transform(), suffice to say for now this is an idiomatic way to return a value in Rust. The language does have a return statement, but a bare expression also works in some cases.
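The two ways of returning a value can be put side by side. This is only a minimal sketch: exclaim() and clamp_to_limit() are hypothetical helpers made up for illustration, not functions from the examples in this post.

```rust
// Hypothetical helpers for illustration; neither appears elsewhere in this post.

// The last line is a bare expression with no trailing semicolon,
// so its value becomes the function's return value.
fn exclaim(mut text: String) -> String {
    text.push_str("!!!");
    text
}

// An explicit `return` leaves the function early; the bare
// expression at the end covers the remaining case.
fn clamp_to_limit(value: i32, limit: i32) -> i32 {
    if value > limit {
        return limit;
    }
    value
}

fn main() {
    let excited = exclaim(String::from("hello, world"));
    println!("{}", excited); // hello, world!!!
    println!("{}", clamp_to_limit(15, 10)); // 10
    println!("{}", clamp_to_limit(7, 10)); // 7
}
```

Adding a semicolon after the final text in exclaim() would turn it into a statement, and the compiler would then reject the function because it no longer produces a String.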
I'll discuss expression-based returns more in a later post.

So in the case of return values, the move semantics of ownership in Rust turn out to be pretty useful: the ownership passes to the caller safely and with no need for expensive copying, since somewhere under the hood it's just a transfer of some reference to a value on the heap. Since the rules apply everywhere it all feels quite consistent and logical.

But as logical as it is, it may seem awfully inconvenient. There are many cases where we want a value to persist after it has been operated on by a function. It would be annoying to have to deep-copy an object every time, or to constantly have to return the argument to the caller as in the example above.

Fortunately Rust provides references to resolve this inconvenience.

References

In Rust references provide a way to refer to a value without actually taking ownership of it. The example below demonstrates the syntax, which is quite reminiscent of C++:

    fn main() {
        let my_string = String::from("one two three");
        let num_words = count_words(&my_string);
        // "my_string" is still valid here.
        println!("'{}' has {} words", my_string, num_words);
    }

    // I'm sure there are more elegant ways to implement
    // this function, this is just for illustrating the point.
    fn count_words(s: &String) -> usize {
        let mut words = 0;
        let mut in_word = false;
        for c in s.chars() {
            if c.is_alphanumeric() {
                if !in_word {
                    words += 1;
                    in_word = true;
                }
            } else {
                in_word = false;
            }
        }
        words
    }

The code example above shows a value being passed by immutable reference. Note that the function signature needs to be updated to take a reference &String, but the caller must also explicitly declare the parameter to be a reference with &my_string. This is unlike in C++ where there's no explicit hint to someone reading the code in the caller that a value might be passed by reference.
For immutable references (or const refs in C++ parlance) this isn't a big deal, but I've always felt that it's important to know for sure whether a function might modify one of its parameters in-place, and in C++ you have to go check the function signature every time to tell whether this is the case. This has always been one of my biggest annoyances with C++ syntax and it's great to see it's been addressed in Rust.

Taking a reference is rather quaintly known as borrowing in Rust. You can take as many references to a value as you like as long as they're immutable.

    fn main() {
        let mut my_value = String::from("hello, world");
        let ref1 = &my_value;
        let ref2 = &my_value;
        let ref3 = &my_value;
    }

Of course, attempting to modify the value through any of these references will result in a compile error, since they're immutable. As you'd expect it's also possible to take mutable references:

    fn main() {
        let mut my_value = String::from("world");
        prefix_hello(&mut my_value);
        println!("New value: {}", my_value);
    }

    fn prefix_hello(arg: &mut String) {
        arg.insert_str(0, "hello ");
    }

This example also illustrates that it's once again clear in the context of the caller that it's specifically a mutable reference that's being passed.

This all seems great, but there are a couple of restrictions I haven't mentioned yet. Firstly, it's only valid to have a single mutable reference to a value at once. If you try to create more than one you'll get an error at compile-time. Secondly, you can't have both immutable and mutable references valid at the same time, which would also be a compile-time error.

The logic behind this is around safety when values are used concurrently. These rules do a good job of ruling out race conditions, as it's not possible to have multiple references to the same object unless they're all immutable, and if the data doesn't change then there can't be a race.
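The two restrictions can be seen in a small sketch. The helpers total_len() and add_suffix() are hypothetical names chosen for illustration, and the sketch assumes a reasonably recent compiler, where a borrow is treated as finished after its last use rather than at the end of the block.

```rust
// Hypothetical helper: reads the value through two shared borrows at once.
fn total_len(a: &str, b: &str) -> usize {
    a.len() + b.len()
}

// Hypothetical helper: modifies the value through an exclusive borrow.
fn add_suffix(target: &mut String) {
    target.push_str(", world");
}

fn main() {
    let mut value = String::from("hello");

    // Any number of immutable borrows may coexist.
    let ref1 = &value;
    let ref2 = &value;
    println!("combined length: {}", total_len(ref1, ref2));

    // ref1 and ref2 are not used again below, so taking a
    // mutable borrow here is accepted.
    let exclusive = &mut value;
    add_suffix(exclusive);

    // Uncommenting the next line would keep ref1 alive across the
    // mutable borrow, and the program would be rejected at compile time:
    // println!("{}", ref1);

    println!("{}", value);
}
```

The interesting part is the commented-out line: the same code is accepted or rejected purely on the basis of whether an immutable reference is still in use when the mutable one is taken.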
In effect, these borrowing rules amount to a multiple readers/single writer lock.

The compiler also protects you against creating dangling references, such as returning a reference to a value on a function's stack. That will fail to compile[3].

A slice of life

Whilst I'm talking about references anyway, it's worth briefly mentioning slices. These are like references, but they only refer to a subset of a collection.

    fn main() {
        let my_value = String::from("hello there, world");
        // String slice "there".
        let there = &my_value[6..11];
        println!("{}", there);
    }

The example above shows a use for an immutable string slice. Actually you may not have realised it but you've seen one of those earlier in this post: all string literals are in fact immutable string slices.

As with slices in most languages the syntax is a half-open interval where the first index is inclusive, the second exclusive. It's also possible to have slices of other collections that are contiguous and it's possible to have mutable slices as well.

    fn main() {
        let mut my_list = [1,2,3,4,5];
        let slice = &mut my_list[1..3];
        slice[1] = 99;
        // [1, 2, 99, 4, 5]
        println!("{:?}", my_list);
    }

As far as I've been able to tell so far, however, it doesn't seem to be possible to assign to the entirety of a mutable slice to replace it. I can understand several reasons why this might not be a good idea to implement, not least of which is that it can change the size of the slice and hence necessitate moving items around in memory that aren't even part of the slice (if you assign something of a different length). But I thought it was worth noting.

Conclusions

In this post I've summarised what I know so far about ownership and references in Rust and generally I think it's shaping up to be a pretty sensible language.
Of course it's hard to say until you've put it to some serious use[4], but I can see that there are good justifications for the quirks that I've discovered so far, bearing in mind the overarching goals of the language.

The ownership rules seem simple enough to keep in mind in practice, and it remains to be seen whether they will make writing non-trivial code more cumbersome than it needs to be. I like the explicit reference syntax in the caller and whilst the move semantics might seem odd at first, I think they're simple and consistent enough to get used to pretty quickly. The fact that the compiler catches so many errors should be particularly helpful, especially as I've found its output to be pleasantly detailed compared to many other languages.

[1] What you would call memory being freed in C++ is referred to as a value being dropped in Rust. The meaning is more or less the same.
[2] Spoiler alert: the scope of a variable in Rust actually extends to the last place in the block where it is referenced, not necessarily to the end of the block, but that doesn't materially alter the discussion of ownership.
[3] Unless you specify the value has a static lifetime, but I'll talk about lifetimes another time.
[4] I came across Perl in 1999 and thought it was pretty cool, from learning it right up until I had to try to fix bugs in the first large project I wrote in it, so it just goes to show that first impressions of programming languages are hardly infallible.

☑ Tracing MacOS Filesystem Events

Recently I had cause to find out where a particular process is currently writing a file on MacOS and I wanted to describe how I went about it for reference.

Now I should point out at this stage that I'm very far from a MacOS expert. I know a few basics, but generally things are slick enough that I don't tend to need to drop down to the terminal to do a lot.
As a result, I'm still discovering little corners where MacOS either provides better tools than I'm used to on Linux, or has some quirky differences to how they work.

Disclaimer aside, here's the deal. I had this process, which I knew was downloading a file. I knew it was a very large file, but I didn't know where it was being downloaded to. I knew the end destination, but it rapidly became clear this process was downloading it to somewhere temporary, so it could presumably rename it into place later. I wanted to monitor the size of the file so I could see how far along it was, so I could figure out how long I was able to spend making a cup of tea. Important stuff.

My go-to solution to this issue on Linux would be to locate the PID, then just ls -l /proc/<pid>/fd. This lists all open filehandles for a process and they're shown as symlinks to the open file. Handy.

MacOS, sadly, does not have /proc in any form. A little Googling around the subject did turn up something called fs_usage, however.

This is in the same vein as strace on Linux, except it's a little more specific. I won't go into full details, but suffice to say it logs all filesystem (and other) activities on the machine. Or you can provide a PID or process name and it'll focus in on that.

So I ended up running something like this:

    sudo fs_usage -f filesys 1234

This shows all filesystem events for PID 1234. The output you get looks a little like this:

    18:02:54.019999  getattrlist                    /tmp/filename
    18:02:54.020828  getattrlist                    /tmp/filename
    18:02:54.020907  getxattr     [ 93]             /tmp/filename
    18:02:54.022703  open         F=24   (R_____)   /tmp/filename
    18:02:54.022706  fcntl        F=24   GETPATH
    18:02:54.022711  close        F=24
    18:02:54.025311  open         F=24   (R_____)   /tmp/filename
    18:02:54.424061  write        F=24   B=0x190d

I've simplified and truncated the output a little, but you get the idea.

This was great and a concrete step forward. You'll notice, however, that the trace for read() and write() calls doesn't print the filename that's being manipulated.
That's probably because those calls operate only on a filehandle, and this tool doesn't want to delve into process state; it just wants to write the parameters to the call out and get on with it.

That's fine if, as in the trace above, you've captured the open() call; you can use that F=24 to link up the traces and figure out which file is being updated.

If, however, you come in halfway through then that's not a lot of help; you'd need to persuade the process to close and re-open the file on demand, and that's pretty hairy stuff.

What we need, then, is a way to look up this file descriptor 24 into a file path.

What we need, then, is lsof.

This is a sufficiently standard Unix tool that it has its own Wikipedia page[1], so I won't go into an in-depth discussion. Suffice to say that its core competency is listing the open files of processes, and that's exactly what we need here.

You can invoke it as lsof -p 1234 and it will show something like this:

    COMMAND  PID   USER    FD   TYPE  DEVICE  SIZE/OFF  NODE        NAME
    Python   5715  myuser  cwd  DIR   1,4     256       4297892359  /Users/myuser
    Python   5715  myuser  txt  REG   1,4     51744     4304043073  /System/Library/…/Python.app/Contents/MacOS/Python
    Python   5715  myuser  txt  REG   1,4     52768     4304046592  /System/Library/…/lib-dynload/_locale.so
    Python   5715  myuser  txt  REG   1,4     63968     4304046668  /System/Library/…/lib-dynload/readline.so
    Python   5715  myuser  txt  REG   1,4     973824    4304094362  /usr/lib/dyld
    Python   5715  myuser  0u   CHR   16,0    0t101338  903         /dev/ttys000
    Python   5715  myuser  1u   CHR   16,0    0t101338  903         /dev/ttys000
    Python   5715  myuser  2u   CHR   16,0    0t101338  903         /dev/ttys000
    Python   5715  myuser  3r   REG   1,4     6804      4297372781  /private/etc/passwd

In this example you can see that my Python process has a cwd entry which corresponds to its current working directory (which it has open) as well as txt entries which correspond to the binary itself and various shared libraries.
These will be populating the text segment.

Then we have four entries where the FD column reads 0u, 1u, 2u and 3r. The numbers represent the file descriptors of the open files within the process and, as those familiar with Unix will recognise, the first three correspond to standard input, output and error respectively. Perhaps slightly oddly the process has all three open for both read and write, as indicated by the u suffix; since these are all open on the terminal device I can only assume that this is just some quirk of the default way the OS creates the process.

The final file descriptor 3r shows that the process has file /private/etc/passwd open for reading (indicated by the r suffix), which is exactly right. This was an interactive Python process and I'd just run fd = open('/etc/passwd'). You'll notice lsof is giving us the real absolute path name; I'd opened /etc/passwd but since on MacOS /etc is a symlink to /private/etc, the path that's reported above is in that destination folder.

So now we have all the pieces we need: we run fs_usage to find out the file descriptors that a process is accessing and then we can map these to filenames using lsof.

Frankly there are probably slightly easier ways to solve the problem, but these are probably handy utilities for future reference so I don't regret the path[2] I took.

[2] Pun very much intended, I'm afraid. Sorry about that.


Andy Pearce's personal blog on life, software and everything. Well, just life and software, mostly.
