furby 2015-09-19 15:59 +03:00

And when I think about it, a lot of "things I want to do" are just learning about and discussing new tools for tinkering with the chain.

URL: http://xkcd.com/1579/

Cargo Cult - 03:02

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=3050

Twitter's Tech Lead On Making Software Engineers More Efficient - 02:00

Tekla Perry writes: "Engineering productivity is hard to measure," said Peter Seibel, the tech lead of Twitter's engineering effectiveness group. "But we certainly can harm it." Seibel spoke this week at the @Scale conference in San Jose, hosted by Facebook. He says in large companies one third of software engineers shouldn't be working on the company's products, but should be dedicated to making other engineers more effective. "As an industry we know how to scale up software," he said. "We also know how to scale up organizations, to put in management that lets thousands of people work together. But we don't have a handle on how to scale up that intersection between engineering and human organization. And maybe we don't understand the importance of that. We massively underinvest in this kind of work."

Read more of this story at Slashdot.

URL: http://developers.slashdot.org/story/15/09/17/2220200/twitters-tech-lead-on-making-software-engineers-more-efficient?utm_source=rss1.0mainlinkanon&utm_medium=feed

Proton Earth, Electron Moon - 00:00

Proton Earth, Electron Moon

What if the Earth were made entirely of protons, and the Moon were made entirely of electrons?

—Noah Williams

This is, by far, the most destructive What-If scenario to date.

You might imagine an electron Moon orbiting a proton Earth, sort of like a gigantic hydrogen atom. On one level, it makes a kind of sense; after all, electrons orbit protons, and moons orbit planets. In fact, a planetary model of the atom was briefly popular (although it turned out not to be very useful for understanding atoms.[1]This model was (mostly) obsolete by the 1920s, but lived on in an elaborate foam-and-pipe-cleaner diorama I made in 6th grade science class.)

The continents are tightly bound, and attempts to split them create energetic jets of particles, or "lava."

If you put two electrons together, they try to fly apart. Electrons are positively charged, and the force of repulsion from this charge is about 20 orders of magnitude stronger than the force of gravity pulling them together.

If you put 10⁵² electrons together—to build a Moon—they push each other apart really hard. In fact, they push each other apart so hard, each electron would be shoved away with an unbelievable amount of energy.

It turns out that, for the proton Earth and electron Moon in Noah's scenario, the planetary model is even more wrong than usual. The Moon wouldn't orbit the Earth because they'd barely have a chance to influence each other;[2]I interpreted the question to mean that the Moon was replaced with a sphere of electrons the size and mass of the Moon, and ditto for the Earth. There are other interpretations, but practically speaking the end result is the same. the forces trying to blow each one apart would be far more powerful than any attractive force between the two.

If we ignore general relativity for a moment—we'll come back to it—we can calculate that the energy from these electrons all pushing on each other would be enough to accelerate all of them outward at near the speed of light.[3]But not past it; we're ignoring general relativity, but not special relativity. Accelerating particles to those speeds isn't unusual; a desktop particle accelerator can accelerate electrons to a reasonable fraction of the speed of light. But the electrons in Noah's Moon would each be carrying much, much more energy than those in a normal accelerator—orders of magnitude more than the Planck energy, which is itself many orders of magnitude larger than the energies we can reach in our largest accelerators. In other words, Noah's question takes us pretty far outside normal physics, into the highly theoretical realm of things like quantum gravity and string theory.

So I contacted Dr. Cindy Keeler, a string theorist with the Niels Bohr Institute. I explained Noah's scenario, and she was kind enough to offer some thoughts.

[20 years later] Ok, while all theories are consistent with those observations, half of them would NOT be consistent with these alternate ones. So in some alternate universes, we could definitely rule these ones out.

Dr. Keeler agreed that we shouldn't rely on any calculations that involve putting that much energy in each electron, since it's so far beyond what we're able to test in our accelerators. "I don't trust anything with energy per particle over the Planck scale. The most energy we've really observed is in cosmic rays; more than LHC by circa 10⁶, I think, but still not close to the Planck energy. Being a string theorist, I'm tempted to say something stringy would happen—but the truth is we just don't know."

Luckily, that's not the end of the story. Remember how we're ignoring general relativity? Well, this is one of the very, very rare situations where bringing in general relativity makes a problem easier to solve.

"Will the larger pile collapse into a black hole?" is written on a piece of paper somewhere in the smaller pile.

There's a huge amount of potential energy in this scenario—the energy that we imagined would blast all these electrons apart. That energy warps space and time just like mass does.[4]If we let the energy blast the electrons apart at near the speed of light, we'd see that energy actually take the form of mass, as the electrons gained mass relativistically. That is, until something stringy happened. The amount of energy in our electron Moon, it turns out, is about equal to the total mass and energy of the entire visible universe.

An entire universe worth of mass-energy—concentrated into the space of our (relatively small) Moon—would warp space-time so strongly that it would overpower even the repulsion of those 10⁵² electrons.

Dr. Keeler's diagnosis: "Yup, black hole." But this is no an ordinary black hole; it's a black hole with a lot of electric charge.[5]The proton Earth, which would also be part of this black hole, would reduce the charge, but since an Earth-mass of protons has much less charge than a Moon-mass of electrons, it doesn't affect the result much. And for that, you need a different set of equations—rather than the standard Schwarzschild equations, you need the Reissner–Nordström ones.

In a sense, the Reissner-Nordström equations compare the outward force of the charge to the inward pull of gravity. If the outward push from the charge is large enough, it's possible the event horizon surrounding the black hole can disappear completely. That would leave behind an infinitely-dense object from which light can escape—a naked singularity.

Once you have a naked singularity, physics starts breaking down in very big ways. Quantum mechanics and general relativity give absurd answers, and they're not even the same absurd answers. Some people have argued that the laws of physics don't allow that kind of situation to arise. As Dr. Keeler put it, "Nobody likes a naked singularity."

In the case of an electron Moon, the energy from all those electrons pushing on each other is so large that the gravitational pull wins, and our singularity would form a normal black hole. At least, "normal" in some sense; it would be a black hole as massive as the observable universe.[6]A black hole with the mass of the observable universe would have a radius of 13.8 billion light-years, and the universe is 13.8 billion years old, which has led some people to say "the Universe is a black hole!" (It's not.)

Would this black hole cause the universe to collapse? Hard to say. The answer depends on what the deal with dark energy is, and nobody knows what the deal with dark energy is.

But for now, at least, nearby galaxies would be safe. Since the gravitational influence of the black hole can only expand outward at the speed of light, much of the universe around us would remain blissfully unaware of our ridiculous electron experiment.

Slice through a Minkowski diagram of my life

URL: http://what-if.xkcd.com/140/

Thursday, 17 September 2015

Premeditated - 02:07

Ads by Project Wonderful! Your ad could be here, right now.

SEMPER LEGENS

URL: http://questionablecontent.net/view.php?comic=3049

Wednesday, 16 September 2015

NYC Counting On Donations To Fund Required K-12 Computer Science Programs - 11:44

theodp writes: "To ensure that every child can learn the skills required to work in New York City's fast-growing technology sector," reports the NY Times, "Mayor Bill de Blasio will announce on Wednesday that within 10 years all of the city's public schools will be required to offer computer science to all students. New York City, the Times adds, plans to spend $81 million over 10 years, half of which it hopes to raise from private sources. Earlier this year, it was announced that Microsoft would make Office 365 ProPlus available to all NYC students, and that Google would make its CS First program available to 100K NYC students who participate in after-school programs.

Read more of this story at Slashdot.

URL: http://developers.slashdot.org/story/15/09/16/0336239/nyc-counting-on-donations-to-fund-required-k-12-computer-science-programs?utm_source=rss1.0mainlinkanon&utm_medium=feed

Squirrelphone - 04:00

After a while, the squirrel starts making that beeping noise and doesn't stop until it hops back up onto the stump.

URL: http://xkcd.com/1578/

Important Feedback - 01:25

Ads by Project Wonderful! Your ad could be here, right now.

pshooo

URL: http://questionablecontent.net/view.php?comic=3048

Tuesday, 15 September 2015

APIs, Not Apps: What the Future Will Be Like When Everyone Can Code - 22:38

An anonymous reader writes: There's been a huge push over the last few years to make programming part of the core academic curriculum. Hype or not, software developer Al Sweigart takes a shot at predicting what this will be in a future where some degree of coding skill is commonplace and he has an interesting take on it: "More programmers doesn't just mean more apps in app stores or clones of existing websites. Universal coding literacy doesn't increase the supply of web services so much as increase the sophistication in how web services are used. Programming—by which I mean being able to direct a computer to access data, organize it, and then make decisions based on it— will open up not only a popular ability to make more of online services, but also to demand more. Almost every major website has an Application Program Interface (API), a formal specification for software to retrieve data and make requests similar to human-directed browsers. ... The vast majority of users don't use these APIs—or even know what an API is—because programming is something that they've left to the professionals. But when coding becomes universal, so will the expectation that websites become accessible to more than just browsers."

Read more of this story at Slashdot.

URL: http://developers.slashdot.org/story/15/09/15/2138230/apis-not-apps-what-the-future-will-be-like-when-everyone-can-code?utm_source=rss1.0mainlinkanon&utm_medium=feed

Julia for Python programmers - 19:30

One of my clients is writing software in Julia so I’m picking up the language. I looked at Julia briefly when it first came out but haven’t used it for work. My memory of the language was that it was almost a dialect of Python. Now that I’m looking at it a little closer, I can see more differences, though the most basic language syntax is more like Python than any other language I’m familiar with.

Here are a few scattered notes on Julia, especially on how it differs from Python.

Array indices in Julia start from 1, like Fortran and R, and unlike any recent language that I know of.
Like Python and many other scripting languages, Julia uses # for one-line comments. It also adds #= and =# for multi-line comments, like /* and */ in C.
By convention, names of functions that modify their first argument end in !. This is not enforced.
Blocks are indented as in Python, but there is no colon at the end of the first line, and there must be an end statement to close the block.
Julia uses elseif as in Perl, not elif as in Python.
Julia uses square brackets to declare a dictionary. Keys and values are separated with =>, as in Perl, rather than with colons, as in Python.
Julia, like Python 3, returns 2.5 when given 5/2. Julia has a // division operator, but it returns a rational number rather than an integer.
The number 3 + 4i would be written 3 + 4im in Julia and 3 + 4j in Python.
Strings are contained in double quotes and characters in single quotes, as in C. Python does not distinguish between characters and strings, and uses single and double quotes interchangeably.
Julia uses function to define a function, similar to JavaScript and R, where Python uses def.
You can access the last element of an array with end, not with -1 as in Perl and Python.

URL: http://www.johndcook.com/blog/2015/09/15/julia-for-python-programmers/

Interview with Facebook’s Peter O’Hearn - 19:00

In this post, I interview Peter O’Hearn, programming languages professor, researcher, and evangelist. Peter now works at Facebook on the Infer static analyzer, which was publicly released back in June 2015. In this interview we take a brief tour of Peter’s background (including his favorite papers) and the path that led him and Infer to Facebook. We discuss how Infer is impacting mobile application development at Facebook, and what Peter hopes it can achieve next. Peter also shares some lessons he’s learned at Facebook regarding PL research and the sometimes surprising impact PL researchers can and are having on industrial software development.

What is your academic background?

I got a BSc in Computer Science from Dalhousie University followed by MSc+PhD at Queen’s (both universities in Canada). For my PhD I was fortunate to be supervised by Bob Tennent, an early proponent of denotational semantics who emphasized strongly how semantics can influence language design. After that I held academic positions at Syracuse University, Queen Mary University of London, and University College London (where I maintain a professor position).

Can you say a little about your research?

I have worked on denotational semantics, mathematical logic, program logic, and automated program verification/analysis. My work started off rather theoretical, involving category theoretic models and the like, and progressively moved towards practice. A main career goal has been to advance reasoning about programs in a way that takes it closer to real world problems, and this has involved jumping between theoretical and practical work.

A key moment was when John Reynolds and I developed Separation Logic. This opened up new avenues to a number of previously unapproachable problems, particularly involving concurrency and involving automated verification for programs that mutate a program heap. It spurred work with Calcagno, Distefano, Berdine, Yang and others on automatic program analysis, culminating program analyses based on Separation Logic applied to industrial code bases.

In 2013 you started working at Facebook. How did that come about, and what have you been up to since?

Since 2013 I have been an Engineering Manager at Facebook, following the acquisition of the startup company I co-founded, called Monoidics Ltd, which was putting some of our program analysis research ideas into practice. In particular, Monoidics developed the Infer static analyzer, which Facebook has now made publicly available. It is technically a shape analysis, which is a kind of analysis that goes deep into the program heap to, for instance, distinguish cyclic from acyclic pointer structures. More importantly, though, Infer behaves in a compositional way, where the analysis results of a composite program are computed from the results of its parts, and this has been crucial for its deployment inside Facebook. (Ed: links to some relevant papers appear below.)

Infer is wired into the development process for our mobile apps, including the main Facebook apps for Android and iOS, Facebook Messenger, and Instagram, among others. It participates in the code review process by writing comments, potential bugs, on code modifications (diffs) when they are submitted for review by our engineers, before these diffs are committed to production. Other participants in review include human reviewers and tests run by the continuous integration system. Because Infer is compositional, it can operate incrementally, quickly producing results on a code diff even if the diff is part of a codebase in the millions of lines.

This deployment model has been hugely important for the success of the analyzer. Infer makes comments in the natural workflow of the engineers, who have been acting on the analysis reports to the tune of a fix rate hovering around 8o% in recent months. Reporting at code review time avoids the expensive human context switch that results from long-running batch mode analysis that produces buglists (say, in overnight runs); we have found batch mode deployment to be considerably more difficult to have success with.

Infer is open source, as are several other Facebook projects. Why release the code?

We at Facebook are interested in technology that will enable us to develop better software, faster.

Program verification is an area whose potential is only beginning to be tapped; there are lots of great ideas in the area, but I feel there is much more practical value to be unlocked for programmers. In the case of Infer itself, although we have had some degree of success deploying it we don’t have all the answers. The tool can improve by paying attention to problems of people outside Facebook who use it. And by putting it out there, we can get feedback and contributions from the research community. So we can have a two-way flow of ideas from people out in the world and our technical team. This will then directly impact Facebook engineers and products, as well as hopefully play a part in the advancement of verification technology generally.

In general, in order to fulfil our mission of making the world more open and connected, a lot of software needs to be built, and we can’t build it all ourselves. Our goal at Facebook is to open source as much of our technology as possible, in particular the technology we feel would be valuable for the broader engineering community at large. ¹ If we have something that fundamentally improves our ability to build valuable software, we’re eager to share it with the world so we can get feedback to that work and also so that other engineers can leverage the work we’ve done to allow us all to move fast and innovate.

How does your job at Facebook differ from your routine as a professor?

One big difference is the kinds of questions I consider. As an academic I can see now that I was mostly concerned with novelty: how can I find new techniques to push the frontier of knowledge forward. In industry I ask myself: how can we help programmers do their jobs better. Whether the answer involves novelty is not as important as whether we can actually help people. I personally like both kinds of question, but it is a real difference of perspective and now I am of the opinion that academic computer science estimates the significance of novelty maybe a little too much. From an industrial perspective we would have more to go on if there were more corroboration and less uncorroborated novelty.

Another big difference is how working at FB we are judged on clear impact — how do our techniques affect the code in apps used by over a billion people? This is very different from considering potential, of perhaps many years out. We can and do consider long-range questions, but questions about immediate impact are there, as is only right in this setting. This makes the work exhilarating, and challenging.

Finally, a real eye-opener has been how evidence-based Facebook engineering is. We measure as much as we can to inform our engineering judgments. For Infer we use a number of performance measures, which involve both CPU-time of the analysis and the resource it takes within the continuous integration ecosystem. We also measure the rate at which developers respond to Infer’s comments (the fix rate), which is a much easier quantity to measure than false positive rate (unless you redefine the false positive concept in terms of fix rate).

Do you have any lessons learned from your time at Facebook to share with programming languages researchers?

I would send a positive message to PL researchers concerning how well the field has been doing. PL research has provided languages that people in industry actually use, such as OCaml and Haskell, but even more importantly concepts from the PL research community are showing up in recent industrial languages like Rust, Swift, Go, Hack, and Flow. It is really a great time for seeing PL research impact. And, beyond the explicit languages themselves are programming idioms. For example, I have seen functional programming idioms show up in industrial code bases for languages that are not usually thought of as functional; the programming ideas themselves can transport into many languages. This sort of idea-flow does not require the academics to make industrial-grade implementations.

In fact, I am personally of the opinion that construction of prototypes is often fine for the research community to focus on. To make a more robust implementation usually costs much more than a research team can or should afford. Yes, there are some important counterexamples (e.g., OCaml), but I would not advise the academic community to necessarily always shoot for “reliable, usable, working artifacts”. That is, at least in the way I understand the meaning of these terms from the context of having a tool that users can rely on. To require people to try to make industrial-grade tools could have the detrimental effect of slowing things down and stifling the development of fresh ideas.

Note that this is not a comment against the drive towards repeatability (which I support) or corroboration (which would be even better). But, we should be careful not to require academics to do what would take many person years to do well in industry.

What papers have you written that you are most proud of (that PL Enthusiast readers might check out)?

Let me mention three.

Local Reasoning about Programs that Alter Data Structures. Peter W. O’Hearn, John C. Reynolds, Hongseok Yang: CSL 2001: 1-19.
Resources, Concurrency and Local Reasoning. Peter O’Hearn: Theor. Comput. Sci. 375(1-3): 271-307 (2007).
Compositional Shape Analysis by Means of Bi-Abduction. Cristiano Calcagno, Dino Distefano, Peter W. O’Hearn, Hongseok Yang: J. ACM 58(6): 26 (2011).

The first is probably the best paper I have written, it is where years of thought related to program logic, AI, etc came together; includes the “principle of local reasoning” that underpins the efficiency of proof tools including Infer and others. Also, I tried to write this paper in a simple way where one did not need to be a died-in-the-wool theorist to read it; the reader can judge whether it succeeds in that!

The second is written in a tutorial style concentrating on fluency rather than metatheory, to try to get ideas across. Steve Brookes wrote a companion paper sorting out the foundations.

The last is the basis of Monoidics/Facebook Infer. More important for compositionality and the use of abduction than for shape.

What research are you working on now?

I still want to understand concurrency, scalably. I would like to have analyses that I could deploy with high speed and low friction (e.g., not copious annotations or proof hints) and produce high-quality reports useful to programmers writing concurrent programs without disturbing their workflow too much. Then it could scale to many programmers and many programs. Maybe I am asking for too much, but that is what I would like to find.

Any closing thoughts?

One thing is that I would like to say that moving to industry has made me more convinced than ever of the huge value of the education work that PL researchers engage in. I have met people who have taken programming courses from PL researchers and the undoubtedly strongly-held and coherently argued viewpoints they see in these courses ends up impacting our codebases and even some of the directions people go in internally. And I am talking about people inside Facebook trained as engineers, not researchers. The PL community needs to go on teaching the principles of programming and programming languages in a deep and uncompromising way, and we need to teach about verification and how it can be used; a lot of work is need on the latter, especially. We should do this at the undergraduate as well as graduate level. These activities can have a big impact on future software.

Follow Peter’s activities and other news about Infer at @fbinfer.

Notes:

Ed: We’ve also heard from Avik Chaudhuri from FB about Flow, and about another language, Hack, that are publicly available. ↩

The post Interview with Facebook’s Peter O’Hearn appeared first on The Programming Languages Enthusiast.

URL: http://www.pl-enthusiast.net/2015/09/15/facebooks-peter-ohearn-on-programming-languages/

Nicholas Higham on Mathematics in Color - 12:00

This is reprint of Nick Higham’s post of the same title from the Princeton University Press blog, used with permission.

Color is a fascinating subject. Important early contributions to our understanding of it came from physicists and mathematicians such as Newton, Young, Grassmann, Maxwell, and Helmholtz. Today, the science of color measurement and description is well established and we rely on it in our daily lives, from when we view images on a computer screen to when we order paint, wallpaper, or a car, of a specified color.

For practical purposes color space, as perceived by humans, is three-dimensional, because our retinas have three different types of cones, which have peak sensitivities at wavelengths corresponding roughly to red, green, and blue. It’s therefore possible to use linear algebra in three dimensions to analyze various aspects of color.

Metamerism

A good example of the use of linear algebra is to understand metamerism, which is the phenomenon whereby two objects can appear to have the same color but are actually giving off light having different spectral decompositions. This is something we are usually unaware of, but it is welcome in that color output systems (such as televisions and computer monitors) rely on it.

Mathematically, the response of the cones on the retina to light can be modeled as a matrix-vector product Af, where A is a 3-by-n matrix and f is an n-vector that contains samples of the spectral distribution of the light hitting the retina. The parameter n is a discretization parameter that is typically about 80 in practice. Metamerism corresponds to the fact that Af₁ = Af₂ is possible for different vectors f₁ and f₂. This equation is equivalent to saying that Ag = 0 for a nonzero vector g = f₁ – f₂, or, in other words, that a matrix with fewer rows than columns has a nontrivial null space.

Metamerism is not always welcome. If you have ever printed your photographs on an inkjet printer you may have observed that a print that looked fine when viewed indoors under tungsten lighting can have a color cast when viewed in daylight.

LAB Space: Separating Color from Luminosity

In digital imaging the term channel refers to the grayscale image representing the values of the pixels in one of the coordinates, most often R, G, or B (for red, green, and blue) in an RGB image. It is sometimes said that an image has ten channels. The number ten is arrived at by combining coordinates from the representation of an image in three different color spaces. RGB supplies three channels, a space called LAB (pronounced “ell-A-B”) provides another three channels, and the last four channels are from CMYK (cyan, magenta, yellow, black), the color space in which all printing is done.

LAB is a rather esoteric color space that separates luminosity (or lightness, the L coordinate) from color (the A and B coordinates). In recent years photographers have realized that LAB can be very useful for image manipulations, allowing certain things to be done much more easily than in RGB. This usage is an example of a technique used all the time by mathematicians: if we can’t solve a problem in a given form then we transform it into another representation of the problem that we can solve.

As an example of the power of LAB space, consider this image of aeroplanes at Schiphol airport.

Original image.

Suppose that KLM are considering changing their livery from blue to pink. How can the image be edited to illustrate how the new livery would look? “Painting in” the new color over the old using the brush tool in image editing software would be a painstaking task (note the many windows to paint around and the darker blue in the shadow area under the tail). The next image was produced in
just a few seconds.

Image converted to LAB space and A channel flipped.

How was it done? The image was converted from RGB to LAB space (which is a nonlinear transformation) and then the coordinates of the A channel were replaced by their negatives. Why did this work? The A channel represents color on a green–magenta axis (and the B channel on a blue–yellow axis). Apart from the blue fuselage, most pixels have a small A component, so reversing the sign of this component doesn’t make much difference to them. But for the blue, which has a negative A component, this flipping of the A channel adds just enough magenta to make the planes pink.

You may recall from earlier this year the infamous photo of a dress that generated a huge amount of interest on the web because some viewers perceived the dress as being blue and black while others saw it as white and gold. A recent paper What Can We Learn from a Dress with Ambiguous Colors? analyzes both the photo and the original dress using LAB coordinates. One reason for using LAB in this context is its device independence, which contrasts with RGB, for which the coordinates have no universally agreed meaning.

Nicholas J. Higham is the Richardson Professor of Applied Mathematics at The University of Manchester, and editor of The Princeton Companion to Applied Mathematics. His article Color Spaces and Digital Imaging in The Princeton Companion to Applied Mathematics gives an introduction to the mathematics of color and the representation and manipulation of digital images. In particular, it emphasizes the role of linear algebra in modeling color and gives more detail on LAB space.

FFT and Big Data (quotes from Princeton Companion to Applied Mathematics)

URL: http://www.johndcook.com/blog/2015/09/15/nicholas-higham-on-mathematics-in-color/

Orbital Gastronomy - 01:28

Ads by Project Wonderful! Your ad could be here, right now.

please buy my nice and good shirts

URL: http://questionablecontent.net/view.php?comic=3047

Monday, 14 September 2015

Advent - 04:00

The few dozen doors that have little Christmas trees on them are a nice touch.

URL: http://xkcd.com/1577/

Sunday, 13 September 2015

That's What It Does - 23:29

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=3046

It Is Programmer Day - Why So Apathetic? - 18:56

mikejuk writes: Programmers Day comes around every year and yet each year it seems to be increasingly ignored. Why, when we are trying to encourage children to take up all things computing, is Programmers Day such a big flop? If you've not encountered it before, the idea is that on a specific day we celebrate computer programmers. It is designated to be on the 256th day of the year, which in most years is September 13th and this year, 2015, it falls on a Sunday. If you don't know why it's the 256th day, then you probably aren't a programmer and there is no point in explaining. The usual suggestions for things to do on programmer day include telling jokes and other fairly lame stuff. How about instead: Teach someone to program just a little bit.

Read more of this story at Slashdot.

URL: http://developers.slashdot.org/story/15/09/13/1526214/it-is-programmer-day---why-so-apathetic?utm_source=rss1.0mainlinkanon&utm_medium=feed

Hire a Developer, Watch Them Work In Real-Time - 09:21

New submitter alphamore writes: Live Coding, which is like Twitch for developers, has added a service that allows viewers to actually hire someone they've been watching. The aptly named 'Hire a streamer' service works exactly as it sounds. Via the profile of a developer you've seen coding on the site, a 'hire me' button lets you request their time. The service is completely opt-in for developers, so not everyone will be for-hire. When you click on the 'hire me' button, you'll be met with a list of disciplines that developer is familiar with, and their hourly rate. Once you've booked a session, the money is held in escrow (transactions happen via the site) until the developer has completed the work.

Read more of this story at Slashdot.

URL: http://developers.slashdot.org/story/15/09/13/0334225/hire-a-developer-watch-them-work-in-real-time?utm_source=rss1.0mainlinkanon&utm_medium=feed

Friday, 11 September 2015

GameStart Uses Minecraft to Teach Kids Programming (Video 1) - 21:19

You can't teach all programming by using Minecraft to keep kids interested, but you can use Minecraft, Java, and Eclipse to give them a good start. That's what Tyler Kilgore and his colleagues at GameStart are doing. Watch today's video (number 1), tomorrow's video (number 2) and read both days' transcripts for the full scoop. EDIT: "Tomorrow's video" should read, "Monday's video."

Read more of this story at Slashdot.

URL: http://developers.slashdot.org/story/15/09/11/2011245/gamestart-uses-minecraft-to-teach-kids-programming-video-1?utm_source=rss1.0mainlinkanon&utm_medium=feed

Chinese Tech Companies Hire 'Cheerleaders' To Motivate Programmers - 14:16

HughPickens.com writes: Lauren O'Neil writes at CBC News that internet companies "across China" are hiring "pretty, talented girls that help create a fun work environment." Dubbed "programming cheerleaders," these young women serve to chit-chat, play Ping-Pong with employees as part of their role, and sometimes smile and clap for male employees who play guitar in the office, as indicated by photos posted to the news service's verified "Trending in China" Facebook page. "According to the HR manager of an Internet company that hired three such cheerleaders, its programmers are mostly male and terrible at socializing," reads China.org.cn's Facebook post. "The presence of these girls have greatly improved their job efficiency and motivation." However people from all over the world have weighed in to decry the reported role. "This is degrading — both to the 'cheerleaders' and the programmers," wrote one commenter on the original post. "Look at the face of the poor woman programmer in the second picture. Stereotypical 'bro' culture only now with Chinese subtitles." Others suggest that the company pictured should simply hire more female programmers. "What a ridiculous job, why reduce women to only be valued by their looks and to assist males. Let them have a job at the desk using their minds!" wrote one woman.

Read more of this story at Slashdot.

URL: http://developers.slashdot.org/story/15/09/11/1415243/chinese-tech-companies-hire-cheerleaders-to-motivate-programmers?utm_source=rss1.0mainlinkanon&utm_medium=feed

I Could Care Less - 04:00

URL: http://xkcd.com/1576/

Thursday, 10 September 2015

Do Tech Firms Really Want Liberal Arts Majors? - 19:40

Nerval's Lobster writes: Not too long ago, a Forbes writer declared that a liberal arts degree had "become tech's hottest ticket." At so-called 'disruptive juggernauts' such as Facebook and Uber, George Anders wrote, 'the war for talent' had moved into non-technical realms such as marketing and sales. While there's undoubtedly some truth to Anders's thesis, technology recruiters and executives aren't seeing any less demand for strong technical skills in a wide variety of roles (Dice link). When there's a need for tech professionals with 'soft skills,' at least one recruiter just recruits computer-science majors from liberal arts schools, figuring those recruits will be more 'well-rounded.' To be clear, Forbes doesn't suggest that IT employers have begun mixing liberal-arts graduates into their technical teams; the article talks more about those graduates ending up in supporting roles such as sales and marketing, or else becoming intermediaries who translate the customer's product requirements into engineering solutions. But nobody should think that a strong technical background isn't as valued as ever throughout tech companies.

Read more of this story at Slashdot.

URL: http://news.slashdot.org/story/15/09/10/188206/do-tech-firms-really-want-liberal-arts-majors?utm_source=rss1.0mainlinkanon&utm_medium=feed

Same Here - 01:35

Ads by Project Wonderful! Your ad could be here, right now.

Hannelore has an electron microscope in her broom closet

URL: http://questionablecontent.net/view.php?comic=3044

Tuesday, 08 September 2015

A Byootiful Mind - 22:51

Ads by Project Wonderful! Your ad could be here, right now.

I have a NEW SHIRT for preorder! Click the image below this newspost!!!!

URL: http://questionablecontent.net/view.php?comic=3043

Node.js v4.0.0 Released - 20:05

New submitter TFlan91 writes: The first merge of the popular Node.js and io.js repositories has been released! From the announcement: "The collaborators of the Node.js project and the members of the Node.js Foundation are proud to offer v4.0.0 for general release. This release represents countless hours of hard work encapsulated in both the Node.js project and the io.js project that are now combined in a single codebase. The Node.js project is now operated by a team of 44 collaborators, 15 of which form its Technical Steering Committee (TSC). Further, over 100 new individuals have been added to the list of people contributing code to core since v0.12.7."

Read more of this story at Slashdot.

URL: http://news.slashdot.org/story/15/09/08/1936221/nodejs-v400-released?utm_source=rss1.0mainlinkanon&utm_medium=feed

What is a bug? - 20:00

Buggy software doesn’t work. According to wikipedia:

A software bug is an error … in a computer program or system that causes it to produce an incorrect or unexpected result, or to behave in unintended ways. Most bugs arise from mistakes and errors made by people in either a program’s source code or its design ...

When something is wrong with a program, we rarely hear of it having one bug — we hear of it having many bugs. I’m wondering: Where does one bug end and the next bug begin?

To answer this question, we need an operational definition of a bug, not the indirect notion present in the Wikipedia quote. ¹

This post starts to explore such a definition, but I’m not satisfied with it yet — I’m hoping you will provide your thoughts in the comments to move it forward.

Motivating scenario

My interest in distinguishing one bug from the next started with our build-it, break-it, fix-it (BIBIFI) programming contest. In the contest’s first round, builder teams write what they hope will be high quality software. In the second round, breaker teams look for bugs, including security bugs, in builders’ software. If they find a bug, they submit a test case that demonstrates it: The test case should succeed when run on our canonical “oracle” implementation, but fail on the buggy implementation. Builders lose points for each bug found, and while breakers gain points. ²

There’s one problem here: Failing test cases are evidence of bugs, but not on a one-to-one basis, since several test cases may tickle the same bug. For example, if I wrote a sorting function that fails to sort the last element of a provided list (e.g., perhaps due to an off-by-one error in the iterator), then the test that takes in=[2,1] produces out=[2,1] (when the output should be [1,2]). Another test revealing the same bug takes in=[4,3,2] produces out=[3,4,2] (when the output should be [2,3,4]). Many others tests, too, can serve as evidence of the bug. Therefore we cannot fairly equate a test case with a bug as that would over-penalize builders and over-reward breakers.

To solve this problem, I’d love some clear-cut method to determine that the two test cases are “the same” in that they are evidence of the same bug. Even if I can’t get a computer to do it, clear guidelines for humans would also be great.

But to do this I need to come up with a definition of “bug” that allows me to distinguish one bug from another.

The BIBIFI contest is not the only context in which you’d like to identify equivalent test cases. A fuzz testing tool (like AFL or CSmith) will generate many tests — we might like to weed out duplicate tests to minimize the number of bug reports, to increase the chances they are acted upon. ³ A bug bounty program (like Google’s) must determine when two submitted reports/tests identify the same bug, to reward only one submitter.

Towards a formal definition of bug

Here’s an attempt to state formally what a bug is, for the purposes of providing some insight into the question of how to distinguish bugs.

A bug is a code fragment (or lack thereof) that contributes to one or more failures. By “contributes to”, we mean that the buggy code fragment is executed (or should have been, but was missing) when the failure happens. A failure is an incorrect input/output pair (e.g., generated by a test case).

Example: Recall the sorting function that fails to sort the last element of the provided list because the iterator has an off-by-one error (on the looping condition); this bug induces the failures listed above (in=[2,1], out=[2,1], and in=[4,3,2], out=[3,4,2]).

A failure could be due to multiple bugs. This basically gets back to the main point of the post: An incorrect program might have many bugs, not just one.

Example: Suppose our sorting program has a printing routine that drops the first element of the list, when printed. Combined with the sorting off-by-one bug, we would have the failures in=[3,1,2], out=[3,2] and in=[1,3,2], out=[3,2]. (That is, we sort the first two elements of the list, leaving the third alone, and the print only the last two.)

Sometimes, the presence of one bug affects the failures or another, in which case we say the one bug interacts with the other bug.

Example: The printing bug interacts with the sorting bug, because its presence affects all failures due to (only) the sorting bug. For example, in=[4,3,2], out=[3,4,2] is masked; rather, in=[4,3,2], out=[4,2] is present, which is also wrong, but different. Conversely, the presence of the sorting bug hides some of the failures of the printing bug. For example, in=[4,4], out=[4] is a failure that is still present, but in=[5,4], out=[5] is not.

Interacting bugs

We can visualize this interaction with a Venn diagram depicting sets of failures owing to a particular bug. When bugs’ sets of failures overlap, then those bugs interact.

Non-interacting bugs

On the other hand, a bug does not interact with another bug when the failures due to the one bug are not affected by the other bug, and vice versa. In this case, the bugs’ sets of failures are non-overlapping.

A fix for a bug is a change to the problematic fragment (or is the addition of the missing fragment) that corrects the bug’s failures. However, when there are multiple bugs that interact, it may be hard to see that a fix to a single bug is efficacious, depending on the order that fixes are applied.

Example: If we first fixed the printing bug, then the failure in=[4,4], out=[4] would go away (yielding correct behavior in=[4,4], out=[4,4]), but we would still have the failure in=[4,3,2], out=[3,4,2] until we fixed the sorting bug. On the other hand, if we fixed the sorting bug first, the existence of the printing bug would still mask the improved behavior, i.e., all of our sorting test cases would still fail. That said, the failures would be different, and more illuminating about the reason for the remaining bug.

Criteria for separating bugs

Supposing that a bug is defined by wrong or missing code that results in a set of failures (i.e., test cases involving wrong input/output pairs), we still must determine which code/failures belong to one bug and which belong to another. This is where things get murky.

Fixes reveal bugs?

We might imagine that a fix could serve as evidence that multiple failures, produced by different test cases, are due to the same bug. In particular, if a single fix makes many failing test cases pass, perhaps that means these test cases should be viewed as due to the same bug?

Unfortunately, this argument is circular. It assumes that a fix applies to only one bug; but why should we assume it does? The code change could have fixed multiple bugs at once, meaning the test cases are not, in fact, the same. So this just kicks the can down the road: To determine if a fix covers multiple bugs, we need a definition of a bug that allows us to distinguish one bug from another.

Bug distinctions by code or specification granularity

I believe one important idea for defining bugs is to identify a level of granularity that serves as a frame of reference. Since bugs are in code, the level might be a code unit, like a procedure, or group of units, like a procedure and all of the procedures it calls. Another kind of granularity is not at the level of code units, but rather at the level of fragments of the specification.

Example: I could have implemented sorting as an insertion sort, with a sort routine and a insert subroutine. A bug in sorting might be viewed as a bug of insert, or a bug in the sort function; the choice might matter if there are several fragments of incorrect code in sort, in which case we might lump them all together as a single bug, or not. On the other hand, the specification of sorting might be the output array is a permutation of the input array, and the output array is ordered. It’s possible that a bug could fall into one half of this spec, or both. If the latter, then even if all of the code is in one unit (e.g., a single sort function) we might say there are two bugs. (Note that our sorting bug example — missing the last element — falls into the latter half of the above spec.)

Of course, code and spec units are somewhat arbitrary: We could write a program as one big function, perhaps with substantial redundancy, or we could write it as many smaller functions. Logical specifications could also be tight or redundant. Nevertheless, I think there is some “essence” of functionality that defines the boundaries of a bug, even if these boundaries might be a fuzzy, and depend on the beholder’s perspective.

Bug interactions complicate the situation because the visible outcome is due to multiple bugs, not one. Therefore it’s hard to tease apart the contributions of each bug. On the other hand, we can confidently say we have two different bugs if they do not interact. Even if the failures do interact, it may be that a potential bug has some non-interacting failures — i.e., its set of failures overlaps with those of another bug, but the set does not contain those failures and is not contained by it. As such, if we fix just that bug, at least these failures would go away, while the dependent failures would go away after fixing the other bug(s).

Bug characteristics

Bugs could be identified by some other element of their character, as opposed to related code or specification units. For example, there are different categories of coding mistakes, such as race conditions, off-by-one errors, div-by-zero, infinite loops, buffer overflows, etc. Even if something like a race condition involves code fragments throughout the program, we probably would consider the race a single bug, and therefore fix all of those fragments at once.

Conclusions and Questions

At this point in my thinking, what constitutes a single bug seems to be a subjective judgment whose basis roughly corresponds to program “features” as defined by distinct portions of code or the specification, or features of the coding error.

If you disagree, or see important subtleties that I have missed, please comment!

If I am right, then one question is how to nevertheless develop automated techniques for bug localization or bug triage given sets of failing and passing tests. For example, we might like to develop techniques that

Cluster failing tests into groups that correspond to distinct bugs.
Given a fix and a set of failing tests, determine whether the fix covers a single bug, or multiple bugs (which is another way of clustering the test cases).

What semantic information would be required for such computations? Traces of execution? AST-level diffs between the buggy and fixed versions?

Another question is how empirical analysis might help us refine the definition of a bug. For example, if we analyzed issue reports and resolutions (bugfixes) on Github, what would we find? Would most fixes be localized into a single code unit, supporting the idea that a bug tends to be local to a particular code unit? How often would we see duplicate or overlapping bug reports, suggesting that an initial report really involved multiple bugs, and not a single bug?

This exploration has been very interesting to me — I never would have thought that an idea as pervasive and well-known as “software bug” would be so hard to pin down!

Thanks to Alex Gyori, Piotr Mardziel, and Nikhil Swamy for interesting discussions on this topic and comments on drafts of this post.

Notes:

Andreas Zeller, in his book Why Programs Fail, prefers the term defect to bug since the latter term is sometimes used to refer to erroneous behavior, rather than erroneous code. I stick with the term bug, in this post, and use it to mean the problematic code (only). ↩
Read more about the contest in this post. ↩
One way to identify test cases that are evidence of the same bug is to minimize those tests, using something like delta-debugging or C-reduce, and then see if any are structurally identical. I posit that this might work for simple tests but will not work in general. ↩

The post What is a bug? appeared first on The Programming Languages Enthusiast.

URL: http://www.pl-enthusiast.net/2015/09/08/what-is-a-bug/

Software Is Hiring, But Manufacturing Is Bleeding - 17:57

Nerval's Lobster writes: Which tech segment added the most jobs in August? According to new data from the U.S. Bureau of Labor Statistics, tech consulting gained 7,000 positions in August, (Dice link) below July's gains of 11,100, but enough to set it ahead of data processing, hosting, and related services (which added 1,600 jobs) and computer and electronic-product manufacturing (which lost 1,800 jobs). The latest numbers reflect some longtime trends: The rise of cloud services and infrastructure has contributed to slackening demand for PCs and other hardware, eroding manufacturing jobs. At the same time, increased appetite for everything from Web developers to information-systems managers has kept employers adding positions in other technology segments. If that didn't make things difficult enough for manufacturing folks, the rise of automation has cut down on the number of manufacturing jobs available worldwide, contributing to continuing pressure on the segment as a whole, despite all the noise about bringing those jobs back to the U.S.

Read more of this story at Slashdot.

URL: http://developers.slashdot.org/story/15/09/08/1641237/software-is-hiring-but-manufacturing-is-bleeding?utm_source=rss1.0mainlinkanon&utm_medium=feed

The Free Software Foundation: 30 Years In - 14:30

An anonymous reader writes: The Free Software Foundation was founded in 1985. To paint a picture of what computing was like back then, the Amiga 1000 was released, C++ was becoming a dominant language, Aldus PageMaker was announced, and networking was just starting to grow. Oh, and that year Careless Whisper by Wham! was a major hit. Things have changed a lot in 30 years. Back in 1985 the FSF was primarily focused on building free pieces of software that were primarily useful to nerdy computer people. These days we have software, services, social networks, and more to consider. In this in-depth interview, FSF executive director John Sullivan discusses the most prominent risks to software freedom today, Richard M. Stallman, and more.

Read more of this story at Slashdot.

URL: http://yro.slashdot.org/story/15/09/08/1436246/the-free-software-foundation-30-years-in?utm_source=rss1.0mainlinkanon&utm_medium=feed

Mixing Haskell and R - 14:07

It would be hard to think of two programming languages more dissimilar than Haskell and R.

Haskell was designed to enforce programming discipline; R was designed for interactive use. Haskell emphasizes correctness; R emphasizes convenience. Haskell emphasizes computer efficiency; R emphasizes interactive user efficiency. Haskell was written to be a proving ground for programming language theorists. R was written to be a workbench for statisticians. Very different goals lead to very different languages.

When I first heard of a project to mix Haskell and R, I was a little shocked. Could it even be done? Aside from the external differences listed above, the differences in language internals are vast. I’m very impressed that the folks at Tweag I/O were able to pull this off. Their HaskellR project lets you call R from Haskell and vice versa. (It’s primarily for Haskell calling R, though you can call Haskell functions from your R code: Haskell calling R calling Haskell. It kinda hurts your brain at first.) Because the languages are so different, some things that are hard in one are easy in the other.

I used HaskellR while it was under initial development. Our project was written in Haskell, but we wanted to access R libraries. There were a few difficulties along the way, as with any project, but these were resolved and eventually it just worked.

URL: http://www.johndcook.com/blog/2015/09/08/mixing-haskell-and-r/

The Fast Fourier Transform (FFT) and big data - 12:00

The most direct way to compute a Fourier transform numerically takes O(n²) operations. The Fast Fourier Transform (FFT) can compute the same result in O(n log n) operations. If n is large, this can be a huge improvement.

James Cooley and John Tukey (re)discovered the FFT in 1965. It was thought to be an original discovery at the time. Only later did someone find a sketch of the algorithm in the papers of Gauss.

Daniel Rockmore wrote the article on the Fast Fourier Transform in The Princeton Companion to Applied Mathematics. He says

In this new world of 1960s “big data,” a clever reduction in computational complexity (a term not yet widely in use) could make a tremendous difference.

Rockmore adds a very interesting footnote to the sentence above:

Many years later Cooley told me that he believed that the Fast Fourier transform could be thought of as one of the inspirations for asymptotic algorithmic analysis and the study of computational complexity, as previous to the publication of his paper with Tukey very few people had considered data sets large enough to suggest the utility of an asymptotic analysis.

Related posts:

URL: http://www.johndcook.com/blog/2015/09/08/fft-big-data/

It's Not A Sex Toy - 03:32

Ads by Project Wonderful! Your ad could be here, right now.

But what IS it?

Also I have a new shirt for sale!

URL: http://questionablecontent.net/view.php?comic=3042

Generalization of Fibonacci ratios - 01:46

Each Fibonacci number is the sum of its two predecessors. My previous post looked at generalizing this to the so-called Tribonacci numbers, each being the sum of its three predecessors. One could keep going, defining the Tetrabonacci numbers and in general the n-Fibonacci numbers for any n at least 2.

For the definition to be complete, you have to specify the first n of the n-Fibonacci numbers. However, these starting values hardly matter for our purposes. We want to look at the limiting ratio of consecutive n-Fibonacci numbers, and this doesn’t depend on the initial conditions. (If you were determined, you could find starting values where this isn’t true. It’s enough to pick integer initial values, at least one of which is not zero.)

As shown in the previous post, the ratio is the largest eigenvalue of an n by n matrix with 1’s on the first row and 1’s immediately below the main diagonal. The characteristic polynomial of such a matrix is

λⁿ – λ^n-1 – λ^n-2 – … -1

and so we look for the largest zero of this polynomial. We can sum the terms with negative coefficients as a geometric series and show that the eigenvalues satisfy

λⁿ – 1/(2 – λ) = 0.

So the limiting ratio of consecutive n-Fibonacci numbers is the largest root of the above equation. You could verify that when n = 2, we get the golden ratio φ as we should, and when n = 3 we get around 1.8393 as in the previous post.

As n gets large, the limiting ratio approaches 2. You can see this by taking the log of the previous equation.

n = -log(2 – λ)/log(λ).

As n goes to infinity, λ must approach 2 so that the right side of the equation also goes to infinity.

URL: http://www.johndcook.com/blog/2015/09/07/generalization-of-fibonacci-ratios/

Monday, 07 September 2015

Poof - 00:26

Ads by Project Wonderful! Your ad could be here, right now.

Claire is in for a couple weeks of very sensitive noseparts.

URL: http://questionablecontent.net/view.php?comic=3041

Sunday, 06 September 2015

Get Big Fast: "500 Club" Delivers Teachers For Code.org - 19:06

theodp writes: The Waterloo Cedar Falls Courier reports that Ben Schafer, an associate CS prof at the Univ. of Northern Iowa, was recognized at Code.org's annual summit for training 570 K-12 teachers in Iowa, which is equivalent to 5.5 percent of all U.S. teachers trained. Schafer ranked No. 2 in the '500 Club', a Code.org affiliate of trainers who trained more than 500 teachers in the first year of the program. Code.org's K-5 Affiliates "deliver one-day, in-person workshops to local elementary school teachers to teach computer science in a format that's fun and accessible". A Term Sheet explains to potential Affiliates that "Code.org will pay you $50 per workshop-attendee to cover costs, including food, and to compensate you and any teaching assistants." According to a White House' Fact Sheet, Code.org plans to use $20 million in philanthropic funds to train 10,000 teachers by fall 2015 and 25,000 teachers by fall 2016. You can follow their progress on Twitter, kids!

Read more of this story at Slashdot.

URL: http://developers.slashdot.org/story/15/09/06/1427237/get-big-fast-500-club-delivers-teachers-for-codeorg?utm_source=rss1.0mainlinkanon&utm_medium=feed

Saturday, 05 September 2015

PHP 7.0 Nearing Release, Performance Almost As Good As HHVM - 22:58

An anonymous reader writes: PHP 7.0 RC2 was released on Friday. In addition to the new language features, PHP 7.0 is advertised as having twice the performance of PHP 5. Benchmarks of PHP 7.0 RC2 show that these performance claims are indeed accurate, and just not for popular PHP programs like WordPress. In tests done by Phoronix, the PHP performance was 2~2.5x faster all while consuming less memory than PHP 5.3~5.6. Facebook's HHVM implementation meanwhile still held a small performance lead, though it was consuming much more memory. PHP 7.0 is scheduled to be released in November.

Read more of this story at Slashdot.

URL: http://developers.slashdot.org/story/15/09/05/1843238/php-70-nearing-release-performance-almost-as-good-as-hhvm?utm_source=rss1.0mainlinkanon&utm_medium=feed

JetBrains Moving Its Dev Tools To Subscription Model - 16:36

esarjeant writes: For many Java developers, IntelliJ has been our predominant IDE. JetBrains is looking to make their tools easier easier to buy and use by switching to a subscription program. Their plan is to have people pay a monthly/yearly fee for access to the tools instead of upgrading when they're ready. Fortunately, if your subscription lapses it looks like you'll have 30 days to check all your stuff in. How does NetBeans look now? Many members of various developer communities are pushing back against this change: "For a developer with an unstable income, it might be perfectly fine to stay on an older version of the software until they've stashed enough cash to afford the upgrade. That will no longer work." JetBrains has acknowledged the feedback, and say they will act on it.

Read more of this story at Slashdot.

URL: http://developers.slashdot.org/story/15/09/05/1347204/jetbrains-moving-its-dev-tools-to-subscription-model?utm_source=rss1.0mainlinkanon&utm_medium=feed

Power method and Fibonacci numbers - 14:40

Take an n × n matrix A and a vector x of length n. Now multiply x by A, then multiply the result by A, over and over again. The sequence of vectors generated by this process will converge to an eigenvector of A. (An eigenvector is a vector whose direction is unchanged when multiplied by A. Multiplying by A may stretch or shrink the vector, but it doesn’t rotate it at all. The amount of stretching is call the corresponding eigenvalue.)

The eigenvector produced by this process is the eigenvector corresponding to the largest eigenvalue of A, largest in absolute value. This assumes A has a unique eigenvector associated with its largest eigenvalue. It also assumes you’re not spectacularly unlucky in your choice of vector to start with.

Assume your starting vector x has some component in the direction of the v, the eigenvector corresponding to the largest eigenvalue. (The vectors that don’t have such a component lie in an n-1 dimensional subspace, which would has measure zero. So if you pick a starting vector at random, with probability 1 it will have some component in the direction we’re after. That’s what I meant when I said you can’t start with a spectacularly unlucky initial choice.) Each time you multiply by A, the component in the direction of v gets stretched more than the components orthogonal to v. After enough iterations, the component in the direction of v dominates the other components.

What does this have to do with Fibonacci numbers? The next number in the Fibonacci sequence is the sum of the previous two. In matrix form this says

$\left[\begin{array}{c} x_{n+1} \\\ x_{n} \end{array}\right] = \left[\begin{array}{cc} 1 & 1 \\\ 1 & 0 \end{array}\right] \left[\begin{array}{c} x_{n} \\\ x_{n-1} \end{array}\right]$

The ratio of consecutive Fibonacci numbers converges to the golden ratio φ because φ is the largest eigenvalue of the matrix above.

The first two Fibonacci numbers are 1 and 1, so the Fibonacci sequence corresponds to repeatedly multiplying by the matrix above, starting with the initial vector x = [1 1]^T. But you could start with any other vector and the ratio of consecutive terms would converge to the golden ratio, provided you don’t start with a vector orthogonal to [1 φ]^T. Starting with any pair of integers, unless both are zero, is enough to avoid this condition, since φ is irrational.

We could generalize this approach to look at other sequences defined by a recurrence relation. For example, we could look at the “Tribonacci” numbers. The Tribonacci sequence starts out 1, 1, 2, and then each successive term is the sum of the three previous terms. We can find the limiting ratio of Tribonacci numbers by finding the largest eigenvalue of the matrix below.

$\left[\begin{array}{ccc} 1 & 1 & 1 \\\ 1 & 0 & 0 \\\ 0 & 1 & 0 \end{array}\right]$

This eigenvalue is the largest root of x³ – x² – x – 1 = 0, which is about 1.8393. As before, the starting values hardly matter. Start with any three integers, at least one of them non-zero, and define each successive term to be the sum of the previous three terms. The ratio of consecutive terms in this series will converge to 1.8393.

By the way, you could compute the limiting ratio of Tribonacci numbers with the following bit of Python code:

      from scipy import matrix, linalg
      M = matrix([[1, 1, 1], [1, 0, 0], [0, 1, 0]])
      print( linalg.eig(M) )

Update: The next post generalizes this one to n-Fibonacci numbers.

URL: http://www.johndcook.com/blog/2015/09/05/power-method-and-fibonacci-numbers/

Friday, 04 September 2015

It's All In The Delivery - 02:43

Ads by Project Wonderful! Your ad could be here, right now.

Claire has a really good delivery

URL: http://questionablecontent.net/view.php?comic=3040

Thursday, 03 September 2015

Right On Her Chest - 01:29

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=3039

Wednesday, 02 September 2015

Neuromantic - 01:29

Ads by Project Wonderful! Your ad could be here, right now.

Claire has a very vague idea of what "cool Claire" would actually look like

URL: http://questionablecontent.net/view.php?comic=3038

Tuesday, 01 September 2015

PL conference papers to get a journal? - 18:00

As I have written previously, academic computer science differs from other scientific disciplines in its heavy use of peer-reviewed conference publications.

Since other disciplines’ conferences typically do not employ peer review, results published at highly selective computer science conferences may not be given the credit they deserve, i.e., the same credit they would receive if published in a similarly selective journal.

The main remedy has simply been to explain the situation to the possibly confused party, be it a dean or provost or a colleague from another department. But this remedy is sometimes ineffective: At some institutions, scientists risk a poor evaluation if they publish too few journal articles, but they risk muting the influence of their work in their own community if they publish too few articles at top conferences.

The ACM publications board has recently put forth a proposal that takes this problem head on by formally recognizing conference publications as equal in quality to journal publications. How? By collecting them in a special journal series called the Proceedings of the ACM.

In this post I briefly summarize the motivation and substance of the ACM proposal and provide some thoughts about it. In the end, I support it, but with some caveats. You have the opportunity to voice your own opinion via survey. You can also read other opinions for (by Kathryn McKinley) and against (by David S. Rosenblum) the proposal (if you can get past the ACM paywall, but that’s a topic for another day…).

Comparing Conferences and Journals

In disciplines like Physics, conference presentations are typically not peer reviewed — the conference is about reporting on previously published results (in a peer reviewed journal) and germinating new ideas among colleagues.

In computer science, conferences serve a similar purpose, but they are also the publication vehicle: To present at a major conference, you submit a paper that describes a substantial result, e.g., an innovation in compiler design along with an evaluation of it on a set of benchmarks. A program committee consisting of area experts chosen a the program chair reviews the paper for quality (including correctness), and only if there is sufficient consensus is the paper accepted. All of the accepted papers are published in the conference “proceedings.”

The process for top journals is similar. When a paper is submitted to a journal, an editor asks 2-3 (sometimes more) experts to review the paper for quality, and based on their reviews decides to accept the paper or not. There may be additional revisions and re-review required before finally accepting the paper.

We can see that the program chair of a conference acts like an editor at a journal, and the program committee serves as the expert reviewers. ¹ The conference review process sometimes allows reviewed revisions, too (e.g., OOPSLA’s two-phase process, or “shepherding” by PLDI), or such revisions may occur “between conferences” — i.e., a paper is rejected with the expectation it will be submitted to another conference in short order, and then re-reviewed.

It is easy to see, just by looking at the papers themselves, that the process for top conferences engenders high quality results. Quantitatively, simply reformatting a POPL or PLDI paper into ACM “Transactions” format (the format used by ACM journals) often turns a 10-12 page conference paper into 25-30 pages. Compare this to the relatively shorter journal format used in other disciplines; e.g., a paper in Nature is 5 pages long.

The Proceedings of the ACM

So, top conference papers in computer science are of the same ilk as journal papers, and should be viewed as such. How to convince the uninitiated of this? The ACM publication board’s proposal is to start collecting conference proceedings in a journal called PACM, for Proceedings of the ACM (or something else if we don’t like that name). Different communities would have their own “series” of PACM, e.g., PACM for Programming Languages. How a SIG would set up its series is somewhat up to them, but the conferences included have a sound review process, i.e., one of the sort I described above.

I should add that this proposal is particularly impactful for the PL community because of all the sub-areas of CS, programming languages is perhaps the most reliant on conference publications. Our flagship journal, TOPLAS, publishes no more than a couple of dozen papers per year (it has published 13 so far this year), while hundreds of papers appear at top conferences.

My opinion: Supportive, with caveats

In general I’m in favor of this proposal because it solves a problem for many people. But to really work it should be as streamlined as possible — conference papers already are of high quality, so why add extra hoops to fix what amounts to a marketing problem?

Here are a three elements of the current proposal that I think should be relaxed:

This process would require written reviews by a minimum number of three qualified reviewers. While three is a fine and common number, I don’t think the bar needs to be that high. Highly regarded journals like Science and Nature require two, not three, reviewers (see Nature‘s process, and Science‘s process).
The review process must permit at least one substantive revision by the authors and review of that revision by the reviewers to determine whether their concerns are adequately met. I also disagree with this requirement. While rounds of review might be useful for authors, they are not necessary for ensuring quality, which is the key motivator here. If the PC doesn’t think the paper is strong enough it can just reject it, and the paper can go to the next conference for re-review. That said, the (selective) use of shepherding ought to be sufficient to meet this requirement; as with journals, some papers have too much to fix to be shepherded, and can be rejected, while others can be changed (with vetting) in sufficient time.
Papers may not have artificial limits on length that would prevent full disclosure of references, documentation of methods, and so on. Again: Too strong. Look at Nature: it has both page limits on material, and a limit on the total number of references. I’m not against relaxing page limits or allowing more references. I’m against the claim that journal papers must not have page limits.

Here’s another element that is important to get right. At the recent SIGB meeting, Kathryn McKinley pointed out that getting the right name is crucial. PLDI, POPL, OOPSLA, ICFP, etc. are all substantial brands. Over the years, many people have come to understand that a paper from one of these conferences is likely to be very good. Therefore, we want to keep that brand as part of the name of the PACM series, so that its effect is retained.

My last point, which is not really about PACM itself but what happens if we adopt it, is that the existence of PACM should not preclude other conference/journal combinations. One attractive model is “submit to a journal, present at a conference.” As examples: You can submit a paper to the ACM TACO journal, and once accepted it can be presented at the HiPEAC conference. You can do the same with PVLDB (journal) and VLDB (conference). I think this model has a lot to speak for it, and we should explore using it for our conferences. For example, one benefit is that you have the option to present your paper at the conference, it’s not a requirement; if you’d prefer not to travel, you don’t have to. You also get rolling submission deadlines (rather than having to rewrite your paper for the slightly, or vastly, different audience of the next conference). Agreeing to PACM should not be the end of the story (and I don’t think it the proposal says that it should be).

Conclusion

I support this new ACM proposal, with caveats that I think are easily addressed. I think the proposed journal series solves a real problem, and if done right should be little to no additional work for authors, conference PC members, etc., once the ball is rolling.

Please share your thoughts in the comments below, and in the survey that ACM is taking about the idea. As ACM SIGPLAN Chair, I will take your feedback up the chain. Things are moving fast, and changes could start happening by the end of the year.

Notes:

Conference committees often solicit outside experts to fill gaps in the committee’s expertise. ↩

The post PL conference papers to get a journal? appeared first on The Programming Languages Enthusiast.

URL: http://www.pl-enthusiast.net/2015/09/01/pl-conference-papers-to-get-a-journal/

Or Unicycles, For That Matter - 02:07

Ads by Project Wonderful! Your ad could be here, right now.

Personal bias: I think everyone should get septum piercings

URL: http://questionablecontent.net/view.php?comic=3037

Secret equation - 00:49

I got a call this afternoon from someone who records audio books for the blind. He wanted to know the name of a symbol he didn’t recognize. He then asked me if the equation was real.

Here’s the equation in context, from the book Michael Vey 4: Hunt for Jade Dragon. The context is as follows.

Suddenly math problems she hadn’t understood made sense. Except now they weren’t just numbers and equations, they were patterns and colors. Calculus, geometry, and trigonometry were easy to understand, simple as a game, like shooting balls at a basketball hoop that was a hundred feet wide. Then a specific sequence of numbers, letters, and symbols started running through her mind.

$s(t; t_y) = k \frac{Q}{r^2} \hat{r} \int_{R^2} m(x, y) e^{-2\pi i \daleth\left(\frac{G_x xt + \daleth G_y yt_y}{2\pi}\right)} \,dx\,dy$

She almost said the equation when a powerful thought came over her not to speak it out loud—that she must not ever divulge it. She new that what she was receiving was something of great importance, even if she had no idea what it meant.

I believe the symbol in question is the fourth letter of the Hebrew alphabet, ℸ (daleth).

Is this a real equation? The overall form of it looks like an integral transform. However, the two instances of ℸ in equation look suspicious.

One reason is that I’ve never seen ℸ used in math, though I read somewhere that Cantor used it for the cardinality of some set. Even so, Cantor’s use wouldn’t make sense inside an integral.

Also, the two instances of ℸ are used differently. The first is a function (or else the factors of 2 π could be cancelled out) and the second one apparently is not. Finally, the equation is symmetric in x and y if you remove the two daleths. So I suspect this was a real equation with the daleths added in for extra mystery.

URL: http://www.johndcook.com/blog/2015/08/31/secret-equation/

Monday, 31 August 2015

Quivering With Anticipation - 01:48

Ads by Project Wonderful! Your ad could be here, right now.

She was at the drugstore to buy that thing at 6am

URL: http://questionablecontent.net/view.php?comic=3036

Friday, 28 August 2015

Robot Anatomy IV: A New Hope - 01:02

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=3035

Thursday, 27 August 2015

Juggling projects - 10:31

Yesterday on Twitter I said I was thinking about writing the names of each of my clients and leads on balls so I could literally juggle them. I was only half joking.

I didn’t write my clients and leads on balls, but I did write them on index cards. And it helped a great deal. It’s easier to think about projects when you have physical representations you can easily move around. Moving lines up and down in an org-mode file, or even moving boxes around in 2D in OneNote, doesn’t work as well.

Electronic files are great for storing, editing, and querying ideas. But they’re not the best medium for generating ideas. See Create offline, analyze online. See also Austin Kleon’s idea of having two separate desks, one digital and one analog.

URL: http://www.johndcook.com/blog/2015/08/27/juggling-projects/

He Was Sleeping - 01:41

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=3034

Wednesday, 26 August 2015

UnFayezed - 01:27

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=3033

Tuesday, 25 August 2015

Casting out sevens - 12:00

A while back I wrote about a method to test whether a number is divisible by seven. I recently ran across another method for testing divisibility by 7 in Martin Gardner’s book The Unexpected Hanging and Other Mathematical Diversions. The method doesn’t save too much effort compared to simply dividing by 7, but it’s interesting. It looks a little mysterious at first, though the explanation of why it works is very simple.

Suppose you want to find whether a number n is divisible by 7. Start with the last digit of n and write a 1 under the last digit, and a 3 under the next to last digit. The digits under the digits of n cycle through 1, 3, 2, 6, 4, 5, repeatedly until there is something under each digit in n. Now multiply each digit of n by the digit under it and add the results.

For example, suppose n = 1394. Write the digits 1, 3, 2, and 6 underneath, from right to left:

1394
6231

The sum we need to compute is 1*6 + 3*2 + 9*3 + 4*1 = 6 + 6 + 27 + 4 = 43.

This sum, 43 in our example, has the same remainder when divided by 7 as the original number, 1394 in our example. Since 43 is not divisible by 7, neither is 1394. Not only that, the result of our method has the same remainder by 7 as the number we started with. In our example, 43 leaves a remainder of 1 by 7, so 1394 also leaves a remainder of 1.

You could apply this method repeatedly, though in this case 43 is small enough that it’s easy enough to see that it leaves a remainder of 1.

Suppose you started with a 1000-digit number n. Each digit is no more than 9, and is being multiplied by a number no more than 6. So the sum would be less than 54000. So you’ve gone from a 1000-digit number to at most a 5-digit number in one step. One or two more steps should be enough for the remainder to be obvious.

Why does this method work? The key is that the multipliers 1, 3, 2, 6, 4, 5 are the remainders when the powers of 10 are divided by 7. Since 10⁶ has a remainder of 1 when divided by 7, the numbers 10^6a+b and 10^b have the same remainder by 7, and that’s why the multipliers have period 6.

All the trick is doing is expanding the base 10 representation of a number and adding up the remainders when each term is divided by seven. In our example, 1394 = 1000 + 3*100 + 9*10 + 4, and mod 7 this reduces to 1*6 + 3*2 + 9*3 + 4*1, the exact calculation above.

The trick presented here is analogous to casting out nines. But since every power of 10 leaves a remainder of 1 when divided by 9, all the multipliers in casting out nines are 1.

You could follow the pattern of this method to create a divisibility rule for any other divisor, say 13 for example, by letting the multipliers be the remainders when powers of 10 are divided by 13.

Related post: Divisibility rules in hex

URL: http://www.johndcook.com/blog/2015/08/25/casting-out-sevens/

Is That A Hotdog Bun? - 01:56

Ads by Project Wonderful! Your ad could be here, right now.

I guess so

URL: http://questionablecontent.net/view.php?comic=3032

Monday, 24 August 2015

But What's THAT Guy's Story? - 02:03

Ads by Project Wonderful! Your ad could be here, right now.

I like Bubbles.

This is the final week (hopefully) of packing stuff for my big move to Canada. There may be comic interruptions but hopefully not! Thank you for your patience.

URL: http://questionablecontent.net/view.php?comic=3031

Saturday, 22 August 2015

Computing square triangular numbers - 00:59

The previous post stated a formula for f(n), the nth square triangular number (i.e. the nth triangular number that is also a square number):

((17 + 12√2)ⁿ + (17 – 12√2)ⁿ – 2)/32

Now 17 – 12√2 is 0.029… and so the term (17 – 12√2)ⁿ approaches zero very quickly as n increases. So the f(n) is very nearly

((17 + 12√2)ⁿ – 2)/32

The error in the approximation is less than 0.001 for all n, so the approximation is exact when rounded to the nearest integer. So the nth square triangular number is

⌊((17 + 12√2)ⁿ +14)/32⌋

where ⌊x⌋ is the greatest integer less than x.

Here’s how you might implement this in Python:

    from math import sqrt, floor

    def f(n):
        x = 17 + 12*sqrt(2)
        return floor((x**n + 14)/32.0)

Unfortunately this formula isn’t that useful in ordinary floating point computation. When n = 11 or larger, the result is needs more bits than are available in the significand of a floating point number. The result is accurate to around 15 digits, but the result is longer than 15 digits.

The result can also be computed with a recurrence relation:

f(n) = 34 f(n-1) – f(n-2) + 2

for n > 2. (Or n > 1 if you define f(0) to be 0, a reasonable thing to do).

This only requires integer arithmetic, so it’s only limitation is the size of the integers you can compute with. Since Python has arbitrarily large integers, the following works for very large integers.

    def f(n):
        if n < 2:
            return n
        return f(n-1)*34 - f(n-2) + 2

This is a direct implementation, though it’s inefficient because of the redundant function evaluations. It could be made more efficient by, for example, using memoization.

URL: http://www.johndcook.com/blog/2015/08/21/computing-square-triangular-numbers/

Friday, 21 August 2015

What Are The Odds - 01:32

Ads by Project Wonderful! Your ad could be here, right now.

It is her most closely-guarded secret

URL: http://questionablecontent.net/view.php?comic=3030

Thursday, 20 August 2015

When is a triangle a square? - 12:00

Of course a triangle cannot be a square, but a triangular number can be a square number.

A triangular number is the sum of the first so many positive integers. For example, 10 is a triangular number because it equals 1+2+3+4. These numbers are called triangle numbers because you can form a triangle by having a row of one coin, two coin, three coins, etc. forming a triangle.

The smallest number that is both triangular and square is 1. The next smallest is 36. There are infinitely many numbers that are both triangular and square, and there’s even a formula for the nth number that is both a triangle and a square:

((17 + 12√2)ⁿ + (17 – 12√2)ⁿ – 2)/32

Source: American Mathematical Monthly, February 1962, page 169.

For more on triangle numbers and their generalizations, see Twelve Days of Christmas and Tetrahedral Numbers.

There is also a way to compute the square triangular numbers recursively discussed in the next post.

URL: http://www.johndcook.com/blog/2015/08/20/when-is-a-triangle-a-square/

Terrifying Backrub - 02:27

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=3029

Wednesday, 19 August 2015

Robot AnAAAAAtomy - 01:40

Ads by Project Wonderful! Your ad could be here, right now.

I drove for 14 hours today so here is a robot anatomy comic! Sorry, arachnophobes. Gordon means well, I promise.

URL: http://questionablecontent.net/view.php?comic=3028

Tuesday, 18 August 2015

Basics equations of beam deflection - 12:00

In the preface to his book Strength of Materials, J. P. Den Hartog says

After the alphabet and the tables of multiplication, nothing has proved quite so useful in my professional life as these six little expressions.

The six expressions he refers to are nicknamed the vergeet-me-nietjes in Dutch, which translates to forget-me-nots in English. They are also known as Dr. Myosotis’s equations because myosotis is the genus for forget-me-nots. The equations give the angular and linear deflections of a cantilever beam.

Imagine a beam anchored at one end and free on the other, subject to one of the kinds of load: a bending moment M at the opposite end, a point force P a the opposite end, or a force w distributed over the length of the beam. The equations below give the rotation (angular deflection) and displacement (linear deflection) of the free end of the beam.

	Rotation	Displacement
Bending moment	ML/EI	ML²/2EI
Point load	PL²/2EI	PL³/3EI
Distributed load	wL³/6EI	wL⁴/8EI

Here E is the modulus of elasticity, L is the length of the beam, and I is the area moment of inertia.

URL: http://www.johndcook.com/blog/2015/08/18/beam-deflection/

Friday, 14 August 2015

Deserted island books - 13:20

You’ve probably heard someone ask someone else what books they would take to a deserted island. It’s usually implied that you’re bringing books for pleasure, not books that would help you survive on the island or leave it.

People often answer the question with a list of their favorite books, perhaps skewed in favor of long books. But I don’t think you should take books that have been your favorites in the past. You should take what you think would be your favorite books on a deserted island. I expect my tastes there would be very different than they are in my current suburban life.

I think of books that I could only read on a desert island, books that I’ve enjoyed in small portions but ordinarily would not have the patience to read cover-to-cover. For example, I’ve found portions of Donald Knuth’s series The Art of Computer Programming enjoyable and useful, but I can’t say I’ve read it all. Perhaps on a deserted island with little to do and few distractions I’d enjoy going through it carefully line by line, attempting all the exercises. I might even learn MMIX, something I can’t imagine doing under ordinary circumstances.

Along those lines, I might want to take some works by Thomas Aquinas such as his Summa Theologica or his commentaries on Aristotle. The little I’ve read of Aquinas has been profound, and more approachable than I expected. Still, I find it hard to read much of him. Alone on an island I might take the time to read him carefully.

For math, I might want to take Spivak’s differential geometry series, depending on how long I expect to be on this island. If I’m going to be there too long and I’m limited on books, I might want to take something else that’s more dense and less familiar.

For science, I’d take Gravitation by Misner, Thorne, and Wheeler. I’ve intended to read that book for many years and have started a couple times. In college I couldn’t afford this price; now I can’t afford the time.

For fiction, I’d take Patrick O’Brian’s Aubrey/Maturin series because I haven’t read it, I’ve heard it is very well written, and it’s long.

URL: http://www.johndcook.com/blog/2015/08/14/deserted-island-books/

Thursday, 13 August 2015

Still With The Man Bun - 01:37

Ads by Project Wonderful! Your ad could be here, right now.

Sven that hair

Sven

URL: http://questionablecontent.net/view.php?comic=3024

Attributing changes to numerator and denominator - 01:19

This afternoon I helped someone debug a financial spreadsheet. One of the reasons spreadsheets can be so frustrating to work with is that assumptions are hard to see. You have to click on cells one at a time to find formulas, then decode cell coordinates into their meanings.

The root problem turned out to be an assumption that sounds reasonable. You’re making two changes, one to a numerator and one to a denominator. The total change equals the sum of the results of each change separately. Except that’s not so.

At this point, a mathematician would say “Of course you can’t split the effects like that. It’s nonlinear.” But it’s worth pursuing a little further. For one thing, it doesn’t help a general audience to just say “it’s nonlinear.” For another, it’s worth seeing when it is appropriate, at least approximately, to attribute the effects this way.

You start with a numerator and denominator, N/D, then change N to N+n and change D to D+d. The total change is then (N+n)/(D+d) – N/D.

The result from only the change in the numerator is n/D. The result from only the change in denominator is N/(D+d) – N/D.

The difference between the total change and the sum of the two partial changes is

–nd/D(D+d).

The assumption that you can take the total change and attribute it to each change separately is wrong in general. But it is correct if n or d is zero, and it is approximately correct with nd is small. This can make the bug harder to find. It could also be useful when nd is indeed small and you don’t need to be exact.

Also, if all the terms are positive, the discrepancy is negative, i.e. the total change is less than the sum of the partial changes. Said another way, allocating the change to each cause separately over-estimates the total change.

URL: http://www.johndcook.com/blog/2015/08/12/debugging-financial-spreadsheet/

Tuesday, 11 August 2015

Deep Meaning - 01:35

Ads by Project Wonderful! Your ad could be here, right now.

That'sa spicy comic strip!

URL: http://questionablecontent.net/view.php?comic=3022

Monday, 10 August 2015

Whoopsie Daisy - 01:06

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=3021

Saturday, 08 August 2015

Effective Computation in Physics - 12:55

Earlier this week I had a chance to talk with Anthony Scopatz and Katy Huff about their new book, Effective Computation in Physics.

JC: Thanks for giving me a copy of the book when we were at SciPy 2015. It’s a nice book. It’s about a lot more than computational physics.

KH: Right. If you think of it as physical science in general, that’s the group we’re trying to target.

JC: Targeting physical science more than life science?

KH: Yes. You can see that more in the data structures we cover which are very float-based rather than things like strings and locations.

AS: To second that, I’d say that all the examples are coming from the physical sciences. The deep examples, like in the parallelism chapter, are most relevant to physicists.

JC: Even in life sciences, there’s a lot more than sequences of base pairs.

KH: Right. A lot of people have asked what chapters they should skip. It’s probable that ecologists or social scientists are not going to be interested in the chapter about HDF5. But the rest of the book, more or less, could be useful to them.

JC: I was impressed that there’s a lot of scattered stuff that you need to know that you’ve brought into one place. This would be a great book to hand a beginning grad student.

KH: That was a big motivation for writing the book. Anthony’s a professor now and I’m applying to be a professor and I can’t spend all my time ramping students up to be useful researchers. I’d rather say “Here’s a book. It’s yours. Come to me if it’s not in the index.”

JC: And they’d rather have a book that they could access any time than have to come to you. Are you thinking of doing a second edition as things change over time?

AS: It’s on the table to do a second edition eventually. Katy and I will have the opportunity if the book is profitable and the material becomes out of date. O’Reilly could ask someone else to write a second edition, but they would ask us first.

JC: Presumably putting out a second edition would not be as much work as creating the first one.

KH: I sure hope not!

AS: There’s a lot of stuff that’s not in this book. Greg Wilson jokingly asked us when Volume 2 would come out. There may be a need for a more intermediate book that extends the topics.

KH: And maybe targets languages other than Python where you’re going to have to deal with configuring and building, installing and linking libraries, that kind of stuff. I’d like to cover more of that, but Python doesn’t have that problem!

JC: You may sell a lot of books when the school year starts.

KH: Anthony and I both have plans for courses based around this book. Hopefully students will find it helpful.

JC: Maybe someone else is planning the same thing. It would be nice if they told you.

AS: A couple people have approached us about doing exactly that. Something I’d like to see is for people teaching courses around it to pull their curriculum together.

JC: Is there a web site for the book, other than an errata page at the publisher?

KH: Sure, there’s physics.codes. Anthony put that up.

AS: There’s also a GitHub repo, physics-codes. That’s where you can find code for all the examples, and that should be kept up to date. We also have a YouTube channel.

JC: When did y’all start writing the book?

AS: It was April or May last year when we finally started writing. There was a proposal cycle six or seven months before that. Katy and I were simultaneously talking to O’Reilly, so that worked out well.

KH: In a sense, the book process started for me in graduate school with The Hacker Within and Software Carpentry. A lot of the flows in the book come from the outlines of Hacker Within tutorials and Software Carpentry tutorials years ago.

AS: On that note, what happened for me, I took those tutorials and turned them into a masters course for AIMS, African Institute for Mathematical Sciences. At the end I thought it would be nice if this were a book. It didn’t occur to me that there was a book’s worth of material until the end of the course at AIMS. I owe a great debt to AIMS in that way.

JC: Is there something else you’d like to say about the book that we haven’t talked about?

KH: I think it would be a fun exercise for someone to try to determine which half of the chapters I wrote and which Anthony wrote. Maybe using some sort of clustering algorithm or pun detection. If anyone wants to do that sort of analysis, I’d love to see if you guess right. Open competition. Free beer from Katy if you can figure out which half. We split the work in half, but it’s really mixed around. People who know us well will probably know that Anthony’s chapters have a high density of puns.

AS: I think the main point that I would like to see come across is that the book is useful to a broader audience outside the physical sciences. Even for people who are not scientists themselves, it’s useful to describe the mindset of physical scientists to software developers or managers. That communication protocol kinda goes both ways, though I didn’t expect that when we started out.

JC: I appreciate that it’s one book. Obviously it won’t cover everything you need to know. But it’s not like here’s a book on Linux, here’s a book on git, here are several books on Python. And some of the material in here isn’t in any book.

KH: Like licensing. Anthony had the idea to add the chapter on licensing. We get asked all the time “Which license do you use? And why?” It’s confusing, and you can get it really wrong.

* * *

Check out Effective Computation in Physics. It’s about more than physics. It’s a lot of what you need to know to get started with scientific computing in Python, all in one place.

Related:

Scientific computing in Python (notes from SciPy 2015)
Python resources
Winston Churchill, Bessie Braddock, and Python

URL: http://www.johndcook.com/blog/2015/08/08/effective-computation-in-physics/

Thursday, 06 August 2015

Noncombatant - 01:33

Ads by Project Wonderful! Your ad could be here, right now.

I spent three hours this morning writing out the ramifications of AI soldiers in the QC universe. I have no idea if it will ever make its way into the comic, but I am a firm believer in working out the behind the scenes mechanics of your universe even if you never display them (in fact, you are probably better off NOT displaying them, as it is often unneccessary).

FYI since people keep asking, QC and Alice are not the same universe.

URL: http://questionablecontent.net/view.php?comic=3019

Wednesday, 05 August 2015

Last digit of largest known prime - 21:44

In my previous post, we looked at the largest known prime, P = 2^57885161 – 1, and how many digits it has in various bases. This post looks at how to find the last digit of P in base b. We assume b < P. (If b = P then the last digit is 0, and if b > P the last digit is P.)

If b is a power of 2, we showed in the previous post that the last digit of P is b-1.

If b is odd, we can find the last digit using Euler’s totient theorem. Let φ(b) be the number of positive integers less than b and relatively prime to b. Then Euler’s theorem tells us that 2^φ(b) ≡ 1 mod b since b is odd. So if r is the remainder when 57885161 is divided by φ(b), then the last digit of Q = P+1 in base b is the same as the last digit of 2^r in base b.

For example, suppose we wanted to know the last digit of P in base 15. Since φ(15) = 8, and the remainder when 57885161 is divided by 8 is 1, the last digit of Q base 15 is 2. So the last digit of P is 1.

So we know how to compute the last digit in base b if b is a power or 2 or odd. Factor out all the powers of 2 from b so that b = 2^k d and d is odd. We can find the last digit base 2^k, and we can find the last digit base d, so can we combine these to find the last digit base b? Yes, that’s exactly what the Chinese Remainder Theorem is for.

To illustrate, suppose we want to find the last digit of P in base 12. P ≡ 3 mod 4 and P ≡ 1 mod 3, so P ≡ 7 mod 12. (The numbers are small enough here to guess. For a systematic approach, see the post mentioned above.) So the last digit of P is 7 in base 12.

If you’d like to write a program to play around with this, you need to be able to compute φ(n). You may have an implementation of this function in your favorite environment. For example, it’s sympy.ntheory.totient in Python. If not, it’s easy to compute φ(n) if you can factor n. You just need to know two things about φ. First, it’s a multiplicative function, meaning that if a and b are relatively prime, then φ(ab) = φ(a) φ(b). Second, if p is a prime, then φ(p^k) = p^k – p^k-1.

URL: http://www.johndcook.com/blog/2015/08/05/last-digit-of-largest-known-prime/

Playing with the largest known prime - 14:26

The largest known prime at the moment is P = 2^57885161 – 1. This means that in binary, the number is a string of 57,885,161 ones.

You can convert binary numbers to other bases that are powers of two, say 2^k, by starting at the right end and converting to the new base in blocks of size k. So in base 4, we’d start at the right end and convert pairs of bits to base 4. Until we get to the left end, all the pairs are 11, and so they all become 3’s in base four. So in base 4, P would be written as a 1 followed by 28,942,580 threes.

Similarly, we could write P in base 8 by starting at the right end and converting triples of bits to base 8. Since 57885161 = 3*19295053 + 2, we’ll have two bits left over when we get to the left end, so in base 8, P would be written as 3 followed by 19,295,053 sevens. And since 57885161 = 4*14471290 + 1, the hexadecimal (base 16) representation of P is a 1 followed by 14,471,290 F’s.

What about base 10, i.e. how many digits does P have? The number of digits in a positive integer n is ⌊log₁₀n⌋ + 1 where ⌊x⌋ is the greatest integer less than x. For positive numbers, this just means chop off the decimals and keep the integer part.

We’ll make things slightly easier for a moment and find how many digits P+1 has.

log₁₀(P+1) = log₁₀(2^57885161) = 57885161 log₁₀(2) = 17425169.76…

and so P+1 has 17425170 digits, and so does P. How do we know P and P+1 have the same number of digits? There are several ways you could argue this. One is that P+1 is not a power of 10, so the number of digits couldn’t change between P and P+1.

Update: How many digits does P have in other bases?

We can take the formula above and simply substitute b for 10: The base b representation of a positive integer n has ⌊log_bn⌋ + 1 digits.

As above, we’d rather work with Q = P+1 than P. The numbers P and Q will have the same number of digits unless Q is a power of the base b. But since Q is a power of 2, this could only happen if b is also a power of two, say b = 2^k, and k is a divisor of 57885161. But 57885161 is prime, so this could only happen if b = Q. If b > P, then P has one digit base b. So we can concentrate on the case b < Q, in which case P and Q have the same number of digits. The number of digits in Q is then 57885161 log_b(2) + 1.

If b = 2^k, we can generalize the discussion above about bases 4, 8, and 16. Let q and r be the quotient and remainder when 57885161 is divided by k. The first digit is r and the remaining q digits are b-1.

Next post: Last digit of largest known prime

URL: http://www.johndcook.com/blog/2015/08/05/largest-known-prime/

Friendly Fire - 01:11

Ads by Project Wonderful! Your ad could be here, right now.

No more comics about smooching, just comics about ROBOT FIGHT from now on forever

URL: http://questionablecontent.net/view.php?comic=3018

Tuesday, 04 August 2015

Last digits of Fibonacci numbers - 12:38

If you write out a sequence of Fibonacci numbers, you can see that the last digits repeat every 60 numbers.

The 61st Fibonacci number is 2504730781961. The 62nd is 4052739537881. Since these end in 1 and 1, the 63rd Fibonacci number must end in 2, etc. and so the pattern starts over.

It’s not obvious that the cycle should have length 60, but it is fairly easy to see that there must be a cycle. There are only 10*10 possibilities for two consecutive digits. Since the Fibonacci numbers are determined by a two-term recurrence, and since the last digit of a sum is determined by the sum of the last digits, the sequence of last digits must repeat eventually. Here “eventually” means after at most 10*10 terms.

Replace “10” by any other base in the paragraph above to show that the sequence of last digits must be cyclic in any base. In base 16, for example, the period is 24. In hexadecimal notation the 25th Fibonacci number is 12511 and the 26th is 1DA31, so the 27th must end in 2, etc.

Here’s a little Python code to find the period of the last digits of Fibonacci numbers working in any base b.

from sympy import fibonacci as f

def period(b):
    for i in range(1, b*b+1):
        if f(i)%b == 0 and f(i+1)%b == 1:
            return(i)

This shows that in base 100 the period is 300. So in base 10 the last two digits repeat every 300 terms.

The period seems to vary erratically with base as shown in the graph below.

URL: http://www.johndcook.com/blog/2015/08/04/last-digits-fibonacci-numbers/

Fite Nite - 02:11

Ads by Project Wonderful! Your ad could be here, right now.

Never bring a goofy antenna to a robot fight

URL: http://questionablecontent.net/view.php?comic=3017

Jupiter Descending - 00:00

Jupiter Descending

If you did fall into Jupiter's atmosphere in a submarine, what would it actually look like? What would you see before you melted or burned up?

—Ada Munroe

We don't know! We've only flown spacecraft into a gas planet's atmosphere twice (both Jupiter). One had no cameras, and one went in at night (and was being disposed of, so wasn't taking pictures anyway).

If you took a submarine to Jupiter, it would burn up. Jupiter has a deep gravity well; after falling down toward it from space, you're going very fast. There's no easy way to shed that speed other than slamming into the atmosphere and letting it slow you down, but the entry is about four times faster than the Space Shuttle or Apollo Earth reentries. Only the Galileo probe has survived a Jupiter atmospheric entry.

The Galileo spacecraft which launched the probe later did the same thing, but without the heat shield.

The Galileo probe survived by being a bullet with a big heat shield, and several inches of the shield were burned away before it finished slowing down and started falling vertically by parachute. The probe only sent back about half a megabyte of data to the Galileo orbiter, and had no camera, so we don't know what it saw.

We could, I guess, try to do the same thing with a submarine—by mounting a giant heat shield on the front—but it would probably qualify as the most Kerbal vehicle ever built.

This would probably need a LOT more struts.

If our submarine survived, what would we see?[1]I mean, submarines don't usually have a lot of windows, but we at least have a periscope, right?

The best pictures we have of Jupiter's atmosphere are probably these carefully processed mosaics of the Great Red Spot, but they're still taken from very far away. At those resolutions, a huge Earth thunderstorm would appear as roughly one pixel. Trying to figure out what Jupiter's clouds would look like from those pictures is like using this to reconstruct these.

By combining the pictures with other science data, we've been able to put together a pretty good idea of what Jupiter's atmosphere is like, but there are some pretty big questions still unanswered which make it hard to be too confident about what it would actually look like. For example, you know the Great Red Spot? We don't know what all that red stuff is.

Author Michael Carroll has done a lot of thinking about planetary atmospheres, and his book Drifting on Alien Winds vividly describes of what it would be like to descend through Jupiter's clouds. He was also kind enough to answer some of my questions about Jupiter. Here's a rough sketch of what you'd see as you descended through the clouds; for more, definitely check out his book.

Galileo's data disproved the seven-layer model for Jupiter's envelope.

Jupiter's upper atmosphere (the part we would see before we died) has three main layers—an upper layer of haze and ammonia clouds, similar to cirrostratus clouds on Earth. Below that is a thick, reddish-brown ammonium hydrosulfide cloud layer. The lowest layer consists of white water clouds, which occasionally rise into towering thunderstorms that occasionally push through the middle layer.

Between these cloud layers, the air is probably pretty clear. At those levels, it would be less dense than the air on Earth, so you could see a long way. Thanks to Rayleigh scattering, the sky would be blue, and objects far off in the distance would fade to blue just like they do on Earth. But since Jupiter is so huge, we might not see the clouds disappear over the horizon; the towers might just fade off into the distance.

We don't really know how accurate these pictures are, though, until we send a camera to fly down into Jupiter's atmosphere and send pictures back.

Juno, scheduled to reach Jupiter next year, carries a pretty nice camera,[2]Thank you to Emily Lakdawalla for her helpful background about various spacecraft cameras, including her many thorough spreadsheets. which should give us slightly better-resolution pictures, but it still won't really give us a sense of what the cloud decks look like from within them. For that, we need to send in a probe with a camera—and there's nothing like that on the horizon. There's certainly more science value in visiting Europa than in satisfying our curiosity about the Jovian sky.

But with interplanetary cubesats soon to make their debut, who knows—maybe someone out there will put together a camera, heat shield, and parachute, and hitch a ride on the next outer planets mission. And then, finally, we'll get to look at Jupiter's clouds from both sides.

From up and down / and still somehow / AAAAAA METHANE SNOW IN MY EYES

URL: http://what-if.xkcd.com/139/

Monday, 03 August 2015

The Monosyllabic Raven - 19:32

David Morice rewrote Edgar Allen Poe’s poem The Raven in words of one syllable. Here is the first stanza:

Once at twelve on one night’s drear, ’twas while I, weak and tired thought here
On the words in lots of quaint and odd old tomes of mind’s lost lore,
While I dozed, so near a nap, there came but then a soft, quick tap,
As of one who made a rap, a rap at my front room’s closed door.
“‘Tis some guest,” I spoke, voice low, “who taps at my front room’s closed door,
Well, Just this, and not much more.”

Morice’s entire poem is available here.

Related post: A rewriting of The Raven that gives a mnemonic for the digits of pi.

URL: http://www.johndcook.com/blog/2015/08/03/the-monosyllabic-raven/

RIP I'm Sorry - 01:35

Ads by Project Wonderful! Your ad could be here, right now.

I am tired of Yelling Bird comics so there will be NO MORE OF THEM, goodbye you shitty bird

URL: http://questionablecontent.net/view.php?comic=3016

Friday, 31 July 2015

The success of OOP - 18:32

Allen Wirfs-Brock gave the following defense of OOP a few days ago in a series of six posts on Twitter:

A young developer approached me after a conf talk and said, “You must feel really bad about the failure of object-oriented programming.” I was confused. I said, “What do you mean that object-orient programming was a failure. Why do you think that?”

He said, “OOP was supposed to fix all of our software engineering problems and it clearly hasn’t. Building software today is just as hard as it was before OOP. came along.”

“Have you ever look at the programs we were building in the early 1980s? At how limited their functionality and UIs were? OOP has been an incredible success. It enabled us to manage complexity as we grew from 100KB applications to today’s 100MB applications.”

Of course OOP hasn’t solved all software engineering problems. Neither has anything else. But OOP has been enormously successful in allowing ordinary programmers to write much larger applications. It has become so pervasive that few programmers consciously think about it; it’s simply how you write software.

I’ve written several posts poking fun at the excesses of OOP and expressing moderate enthusiasm for functional programming, but I appreciate OOP. I believe functional programming will influence object oriented programming, but not replace it.

Related:

URL: http://www.johndcook.com/blog/2015/07/31/the-success-of-oop/

Robot Anatomy III - 01:18

Ads by Project Wonderful! Your ad could be here, right now.

Gencon is insane you guys

I am at booth 641

Come buy stuff before I sell it all

URL: http://questionablecontent.net/view.php?comic=3015

Thursday, 30 July 2015

Robot Anatomy II - 01:47

Ads by Project Wonderful! Your ad could be here, right now.

I am at Gencon! Come find me at booth 641!

URL: http://questionablecontent.net/view.php?comic=3014

Wednesday, 29 July 2015

The Book Of Hanners - 01:05

Ads by Project Wonderful! Your ad could be here, right now.

There will be non-regular updates until next Tuesday because I am going to GenCon. Thank you for your patience and understanding.

URL: http://questionablecontent.net/view.php?comic=3013

Tuesday, 28 July 2015

Comfort Food - 02:01

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=3012

Jupiter Submarine - 00:00

Jupiter Submarine

What if you released a submarine into Jupiter's atmosphere? Would it eventually reach a point where it would float? Could it navigate?

—KTH

Nope! Jupiter's pressure, density, and temperature curves are different from ours. At the point in Jupiter's atmosphere where the density is high enough for a submarine to float, the pressure is high enough to crush the submarine,[1]Which makes it more dense. and the temperature is high enough to melt it.[2]Which makes it harder to drive.

Phase diagram courtesy a very hardcore unauthorized edition of the CRC Handbook of Chemistry and Physics.

But there's another problem: Jupiter is a gas giant, but submarines—as you can figure out from etymology—go under water.

Second-horse capabilities dramatically changed the global geopolitical strategic balance vis-a-vis horses.

Air and water are different. This seems straightforward enough, but they're also the same in a lot of ways. They're both "fluids," and some of the same rules apply to each. In some sense, when you look up at the sky, you're looking up from the bottom of a deep sea of air.

Things float when they're less dense than the fluid around them. This works the same for balloons in air and boats in water. The late Terry Pratchett wrote a truly beautiful passage about this, in the prologue to his book Going Postal. He says that since water is in many respects a wetter form of air,[3]Sounds reasonable enough to me. as ships sink, eventually they reach a point where the water was too dense to sink any further. This layer forms an underwater surface on which shipwrecks collect, drifting around beneath the waves but far above the sea floor:

It’s calm there. Dead calm.

Some stricken ships have rigging; some even have sails. Many still have crew, tangled in the rigging or lashed to the wheel.

But the voyages still continue, aimlessly, with no harbour in sight, because there are currents under the ocean and so the dead ships with their skeleton crews sail on around the world, over sunken cities and between drowned mountains, until rot and shipworms eat them away and they disintegrate.

Sometimes an anchor drops, all the way to the dark, cold calmness of the abyssal plain, and disturbs the stillness of centuries by throwing up a cloud of silt.

I love that passage. It's also completely wrong. Ships sink all the way to the bottom. (Sir Terry knew this, as the rest of the passage makes clear, but he's describing how ships work on Discworld, not Earth.)

Air follows the ideal gas law. The more pressure you put on it, the smaller (and denser) it gets.

Water, on the other hand, is pretty much incompressible. When you dive into the ocean, the pressure increases as you go deeper (rising by one atmosphere every 10 meters or so) but the water's density barely changes all the way down to the sea floor.

Buoyancy depends on density, not pressure. There's a point in Jupiter's atmosphere where the pressure is equal to a little more than an Earth atmosphere—which is the pressure a submarine is used to—but the air there is barely a tenth as dense as ours. A submarine in that layer would fall even faster than it would in the air on Earth.

Not sure what's going on, but you should probably fasten your seatbelts, assuming these things have seatbelts.

To reach a depth where it could "float" in Jupiter, the submarine would have to go halfway to the center of the planet, where the intense pressure turns the air into a metallic soup that's hotter than the surface of the Sun. The pressure there would be so high that not only would the submarine be crushed, the substances that make it up would probably be converted into new and exciting forms. It's hard to create those kinds of conditions in a lab, so we don't know a lot about how materials behave with that much pressure pushing down on them.

In the sea, on the other hand, the density of the fluid stays relatively constant. That means the submarine can find its appropriate pressure range and float there. In other words, submarines only work because water doesn't follow the ideal gas law.

But there's one more twist: Water sort of does obey the ideal gas law. The equations governing water under normal pressure are similar to the equation for a gas under about twenty thousand atmospheres of pressure. In a sense, this is why water seems incompressible to us—it behaves as if it's already compressed so much that an extra atmosphere or two hardly makes a difference.

So, in some ways, water and air are more similar than they seem, but in the ways that matter for a submarine, they really are different.

Which, of course, brings us back to why it's called a submarine—it operates under a "mare". A vehicle designed to operate beneath a sea of air would be called a subaerine.[5]The "gas" in "gas giant" is from the Greek word for void (χάος, kháos), so maybe a vessel (σκάφη, skáphē) that travels through a gas giant's atmosphere would be a khaoskaphe.

Which, come to think of it, is a perfectly good description of a car.

While it's popularly applied to many cars, "subaru" properly refers only to vehicles that operate under Italian cyclist Fabio Aru.

URL: http://what-if.xkcd.com/138/

Monday, 27 July 2015

Of Course He Has A Man Bun - 03:44

Ads by Project Wonderful! Your ad could be here, right now.

Been a while since we last saw Sven.

Gencon is next weekend! I will be there! There will probably be a couple filler updates because I will be so busy. Thanks for your patience and understanding.

URL: http://questionablecontent.net/view.php?comic=3011

Friday, 24 July 2015

Plausible Deniability - 01:38

Ads by Project Wonderful! Your ad could be here, right now.

The thing is, all of it was true

URL: http://questionablecontent.net/view.php?comic=3010

Thursday, 23 July 2015

Life lessons from differential equations - 14:39

Ten life lessons from differential equations:

Some problems simply have no solution.
Some problems have no simple solution.
Some problems have many solutions.
Determining that a solution exists may be half the work of finding it.
Solutions that work well locally may blow up when extended too far.
Boundary conditions are the hard part.
Something that starts out as a good solution may become a very bad solution.
You can fool yourself by constructing a solution where one doesn’t exist.
Expand your possibilities to find a solution, then reduce them to see how good the solution is.
You can sometimes do what sounds impossible by reframing your problem.

They Should Go To The Support Group - 01:19

Ads by Project Wonderful! Your ad could be here, right now.

Bubbles is a little self-conscious about her size

URL: http://questionablecontent.net/view.php?comic=3009

Wednesday, 22 July 2015

Glorified SNES Cartridge - 01:15

Ads by Project Wonderful! Your ad could be here, right now.

Faye has unconventional life goals.

URL: http://questionablecontent.net/view.php?comic=3008

Tuesday, 21 July 2015

Her Name Is CORPSE WITCH - 02:04

Ads by Project Wonderful! Your ad could be here, right now.

Everything is better with spikes

URL: http://questionablecontent.net/view.php?comic=3007

Monday, 20 July 2015

Numerators of harmonic numbers - 04:08

Harmonic numbers

The nth harmonic number, H_n, is the sum of the reciprocals of the integers up to and including n. For example,

H₄ = 1 + 1/2 + 1/3 + 1/4 = 25/12.

Here’s a curious fact about harmonic numbers, known as Wolstenholme’s theorem:

For a prime p > 3, the numerator of H_p-1 is divisible by p².

The example above shows this for p = 5. In that case, the numerator is not just divisible by p², it is p², though this doesn’t hold in general. For example, H₁₀ = 7381/2520. The numerator 7381 is divisible by 11² = 121, but it’s not equal to 121.

Generalized harmonic numbers

The generalized harmonic numbers H_n,m are the sums of the reciprocals of the first n positive integers, each raised to the power m. Wolstenholme’s theorem also says something about these numbers too:

For a prime p > 3, the numerator of H_p-1,2 is divisible by p.

For example, H_4,2 = 205/144, and the numerator is clearly divisible by 5.

Computing with Python

You can play with harmonic numbers and generalized harmonic numbers in Python using SymPy. Start with the import statement

from sympy.functions.combinatorial.numbers import harmonic

Then you can get the nth harmonic number with harmonic(n) and generalized harmonic numbers with harmonic(n, m).

To extract the numerators, you can use the method as_numer_denom to turn the fractions into (numerator, denominator) pairs. For example, you can create a list of the denominators of the first 10 harmonic numbers with

[harmonic(n).as_numer_denom()[0] for n in range(10)]

What about 0?

You might notice that harmonic(0) returns 0, as it should. The sum defining the harmonic numbers is empty in this case, and empty sums are defined to be zero.

URL: http://www.johndcook.com/blog/2015/07/19/numerators-of-harmonic-numbers/

High-dimensional integration - 02:54

Numerically integrating functions of many variables almost always requires Monte Carlo integration or some variation. Numerical analysis textbooks often emphasize one-dimensional integration and, almost as a footnote, say that you can use a product scheme to bootstrap one-dimensional methods to higher dimensions. True, but impractical.

Traditional numerical integration routines work well in low dimensions. (“Low” depends on context, but let’s say less than 10 dimensions.) When you have the choice of a well-understood, deterministic method, and it works well, by all means use it. I’ve seen people who are so used to problems requiring Monte Carlo methods that they use these methods on low-dimensional problems where they’re not the most efficient (or robust) alternative.

Monte Carlo

Except for very special integrands, high dimensional integration requires either Monte Carlo integration or some variation such as quasi-Monte Carlo methods.

As I wrote about here, naive Monte Carlo integration isn’t practical for high dimensions. It’s unlikely that any of your integration points will fall in the region of interest without some sort of importance sampler. Constructing a good importance sampler requires some understanding of your particular problem. (Sorry. No one-size-fits-all black-box methods are available.)

Quasi-Monte Carlo (QMC)

Quasi-Monte Carlo methods (QMC) are interesting because they rely on something like a random sequence, but “more random” in a sense, i.e. more effective at exploring a high-dimensional space. These methods work well when an integrand has low effective dimension. The effective dimension is low when nearly all the action takes place on a lower dimensional manifold. For example, a function may ostensibly depend on 10,000 variables, while for practical purposes 12 of these are by far the most important. There are some papers by Art Owen that say, roughly, that Quasi-Monte Carlo methods work well if and only if the effective dimension is much smaller than the nominal dimension. (QMC methods are especially popular in financial applications.)

While QMC methods can be more efficient than Monte Carlo methods, it’s hard to get bounds on the integration error with these methods. Hybrid approaches, such as randomized QMC can perform remarkably well and give theoretical error bounds.

Markov Chain Monte Carlo (MCMC)

Markov Chain Monte Carlo (MCMC) is a common variation on Monte Carlo integration that uses dependent random samples. In many applications, such as Bayesian statistics, you’d like to be able to create independent random samples from some probability distribution, but this is not practical. MCMC is a compromise. It produces a Markov chain of dependent samples, and asymptotically these samples have the desired distribution. The dependent nature of the samples makes it difficult to estimate error and to determine how well the integration estimates using the Markov chain have converged.

(Some people speak of the Markov chain itself converging. It doesn’t. The estimates using the Markov chain may converge. Speaking about the chain converging may just be using informal language, or it may betray a lack of understanding.)

There is little theory regarding the convergence of estimates from MCMC, aside from toy problems. An estimate can appear to have converged, while an important region of the integration problem remains unexplored. Despite its drawbacks, MCMC is the only game in town for many applications.

Consulting

If you’d like help with high dimensional integration, or any other kind of numerical integration, please contact me to discuss how we might work together.

URL: http://www.johndcook.com/blog/2015/07/19/high-dimensional-integration/

Friday, 17 July 2015

Wolfram obeys Zawinksi’s law - 13:17

Zawinski’s law of software development:

Every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can.

The folks behind Mathematica and Wolfram Alpha announced yesterday now they too can read email.

URL: http://www.johndcook.com/blog/2015/07/17/wolfram-obeys-zawinksis-law/

HONK - 11:38

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=3005

Thursday, 16 July 2015

Scientific computing in Python - 18:53

Scientific computing in Python is expanding and maturing rapidly. Last week at the SciPy 2015 conference there were about twice as many people as when I’d last gone to the conference in 2013.

You can get some idea of the rapid develop of the scientific Python stack and its future direction by watching the final keynote of the conference by Jake VanderPlas.

I used Python for a while before I discovered that there were so many Python libraries for scientific computing. At the time I was considering learning Ruby or some other scripting language, but I committed to Python when I found out that Python has far more libraries for the kind of work I do than other languages do. It felt like I’d discovered a secret hoard of code. I expect it would be easier today to discover the scientific Python stack. (It really is becoming more of an integrated stack, not just a collection of isolated libraries. This is one of the themes in the keynote above.)

When people ask me why I use Python, rather than languages like Matlab or R, my response is that I do a mixture of mathematical programming and general programming. I’d rather do mathematics in a general programming language than do general programming in a mathematical language.

One of the drawbacks of Python, relative to C++ and related languages, is speed. This is a problem in languages like R as well. However, with Python there are ways to speed up code without completely rewriting it, such as Cython and Numba. The only reliable way I’ve found to make R much faster, is to rewrite it in another language.

Another drawback of Python until recently was that data manipulation and exploration were not as convenient as one would hope. However, that has changed due to developments such as Pandas, initiated by Wes McKinney. For more on how that came to be and where it’s going, see his keynote from the second day of SciPy 2015.

It’s not clear why Python has become the language of choice for so many people in scientific computing. Maybe if people like Travis Oliphant had decided to use some other language for scientific programming years ado, we’d all be using that language now. Python wasn’t intended to be a scientific programming language. And as Jake VanderPlas points out in his keynote, Python still is not a scientific programming language, but the foundation for a scientific programming stack. Maybe Python’s strength is that it’s not a scientific language. It has drawn more computer scientists to contribute to the core language than it would have if it had been more of a domain-specific language.

* * *

If you’d like help moving to the Python stack, please let me know.

URL: http://www.johndcook.com/blog/2015/07/16/scientific-computing-in-python/

Wednesday, 15 July 2015

Her Name Is Bubbles - 02:53

Ads by Project Wonderful! Your ad could be here, right now.

I like Bubbles

URL: http://questionablecontent.net/view.php?comic=3003

Tuesday, 14 July 2015

Conspiracy Theory - 02:02

Ads by Project Wonderful! Your ad could be here, right now.

I have never seen a Northampton skateboarder land even the simplest ollie in 15 years of living here

URL: http://questionablecontent.net/view.php?comic=3002

New Horizons - 00:00

New Horizons

What if New Horizons hits my car?

—Robin Sheat

The New Horizons spacecraft is currently flying past Pluto.[1]Ever since astrophysicist Katie Mack pointed out that "New Horizons" appears in the lyrics to A Whole New World, I've gotten it stuck in my head every time I've seen something about Pluto. For the last few days, it's been giving us our first clear look at the world, and it should be making its closest approach at the moment this article is posted. Either that, or hitting your car, I guess.

STATUS: NEW HORIZONS HAS, AT LAST, CAUGHT AND DESTROYED PLUTO.

It's hard to imagine how that could happen, even if New Horizons had headed to Earth by mistake. Unless there's been an especially strange freeway accident, your car is currently within the Earth's atmosphere. All that air stops spacecraft from flying into the ground at full speed. But maybe you took a wrong turn and ended up near Charon, or maybe you drove into a freak extremely-low-pressure system, leaving no atmosphere above you. It could happen![2]It really, really couldn't.

New Horizons is about the size and weight of a grand piano, and is currently screaming along at about 14 kilometers per second. If it hit your car, it would be pretty bad for both vehicles.

How fast is 14 kilometers per second? Here's my favorite comparison for putting that speed in perspective: If you were standing at one end of a football field and fired a gun toward the other end, right while New Horizons flew past you, the spacecraft would reach the far end zone before the bullet made it to the 10-yard line.[3]In that same amount of time, a speeding car would travel about an inch.

While this play isn't legal in American football, it's permitted in rugby.

This high speed means that by this afternoon, New Horizons will be on its way out of the Pluto system,[4]People often ask why New Horizons is just doing a flyby, and not sticking around to orbit Pluto. The answer is: If you can figure out a way to do that, go for it. Pluto is really far away, and to get a probe there before your career ends, you have to go really fast. When you're going that fast, it's hard to stop. (At least, if you want to stop in one piece.) and over the coming days and weeks it will let us know what it saw today. It can't talk to Earth and take photos at the same time, so right now it's spending all its time taking pictures and gathering data.

Later today, the spacecraft will pause the data-gathering for a moment to send a brief message to Earth. No results—just, "Hey, I'm still alive". If it is still alive, that is. It's flying at terrifying speed through a part of the Solar System we've never visited. There could be, say, a bunch of small rocks there.[5]In case of disaster, New Horizons has sent back a few snapshots and data dumps right before the encounter, so at least we'll have those. Or a car.

New Horizons will send the "I'm okay" message in the afternoon, but it takes light four and a half hours to get back to Earth, so it will get here around 8:53pm Eastern US time—so if you're going to have a Pluto party, that's the time to do it. You can tune in to NASA TV to watch the nervous people in mission control wait for the signal. You'll know it worked if there's lots of cheering and hugging.

For more details on the mission, check out Emily Lakdawalla's comprehensive Planetary Society post, What to expect when you're expecting a flyby, which has dates, times, and background on all the equipment. (For up-to-the-minute coverage, her Twitter feed is probably the best place to go for updates, context, and excitement.)

So what does all this mean for your car?

Passenger cars have "crumple zones," which are areas of the car designed to fold up and absorb some of the force of an impact before it reaches the passenger cabin. Unfortunately, in a hypervelocity impact, materials like metal aren't nearly strong enough to hold together. Instead of crumpling, they splash. New Horizons and your car's crumple zone would splash as bits of them passed through each other, and the resulting spray of metal would do the same to the rest of your car. From a distance, it would probably look approximately like this.

Here's the good news: NASA will have to pay for your car. Under the Convention on the International Liability for Damage Caused by Space Object, NASA and the US government would clearly be on the hook for the damage. And, since you wouldn't be considered at fault in the accident, in most states insurance companies would be legally prohibited from raising your premiums.

The situation would be slightly complicated by the fact that this would be a nuclear accident. New Horizons flies too far from the Sun to use solar panels, so it's powered by the heat from a bunch of lumps of plutonium-238. The container holding the plutonium is sturdy, since it's designed to survive atmospheric reentry (and has done so). However, it's not designed to survive entry into a Chevy. The container and the plutonium inside it would be splattered across the landscape. The US government will not only have to replace your car, it will probably have to replace much of your neighborhood.

This has actually happened before. In 1978, the Soviet satellite Kosmos 954, which carried a nuclear reactor, reentered the atmosphere and disintegrated over Canada. The Canadian government spent millions cleaning up the radioactive debris near Yellowknife. They demanded over $6 million (CAD) from the Soviets for the cleanup, and were eventually paid $3 million.

Hopefully, New Horizons is currently flying past Pluto. But don't worry; if it somehow hits your car instead, the US government will cover things. To find out which one it is—Pluto or your car—tune in to NASA TV.

NEW HORIZONS HAS DISCOVERED ANOTHER MOON ORBITING YOUR CAR.

And watch your driveway.

URL: http://what-if.xkcd.com/137/

Monday, 13 July 2015

Legalize It - 01:26

Ads by Project Wonderful! Your ad could be here, right now.

This comic is gonna seem very dated in 5 years when it will be legal to inject DMT into your eyeballs at the mayor's office

URL: http://questionablecontent.net/view.php?comic=3001

Friday, 10 July 2015

Post-Beyonce Comedown - 04:00

Ads by Project Wonderful! Your ad could be here, right now.

3000 comics! That is a lot of comics.

URL: http://questionablecontent.net/view.php?comic=3000

Thursday, 09 July 2015

Contact info diagram - 11:49

My contact information arranged into a diagram:

Yesterday at SciPy 2015 Allen Downey did something similar for his contact info and gave me the idea for the image above.

LinkedIn doesn’t quite fit; you have to know that LinkedIn profiles stick linkedin.com/in/ before the user name.

URL: http://www.johndcook.com/blog/2015/07/09/contact-info-diagram/

Beyonce Moment - 02:10

Ads by Project Wonderful! Your ad could be here, right now.

Dang, Emily!

URL: http://questionablecontent.net/view.php?comic=2999

Wednesday, 08 July 2015

Try Them On A Mocha - 03:09

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=2998

Tuesday, 07 July 2015

Crisis Of Confidence - 02:31

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=2997

Multiple choice - 01:41

A certain question has the following possible answers.

All of the below
None of the below
All of the above
One of the above
None of the above
None of the above

Which answer is correct?

Source

URL: http://www.johndcook.com/blog/2015/07/06/multiple-choice/

Monday, 06 July 2015

Machiavelli - 01:52

Ads by Project Wonderful! Your ad could be here, right now.

Wolves are a kind of dog

URL: http://questionablecontent.net/view.php?comic=2996

Saturday, 04 July 2015

When the last digits of powers don’t change - 13:43

If you raise any integer to the fifth power, its last digit doesn’t change. For example, 2⁵ = 32.

It’s easy to prove this assertion by brute force. Since the last digit of bⁿ only depends on the last digit of b, it’s enough to verify that the statement above holds for 0, 1, 2, …, 9.

Although that’s a valid proof, it’s not very satisfying. Why fifth powers? We’re implicitly working in base 10, as humans usually do, so what is the relation between 5 and 10 in this context? How might we generalize it to other bases?

The key is that 5 = φ(10) + 1 where φ(n) is the number of positive integers less than n and relatively prime to n.

Euler discovered the φ function and proved that if a and m are relatively prime, then

a^φ(m) = 1 (mod m)

This means that a^φ(m) – 1 is divisible by m. (Technically I should use the symbol ≡ (U+2261) rather than = above since the statement is a congruence, not an equation. But since Unicode font support on various devices is disappointing, not everyone could see the proper symbol.)

Euler’s theorem tells us that if a is relatively prime to 10 then a⁴ ends in 1, so a⁵ ends in the same digit as a. That proves our original statement for numbers ending in 1, 3, 7, and 9. But what about the rest, numbers that are divisible by 2 or 5? We’ll answer that question and a little more general one at the same time. Let m = αβ where α and β are distinct primes. In particular, we could choose α = 2 and β = 5; We will show that

a^φ(m)+1 = a (mod m)

for all a, whether relatively prime to m or not. This would show in addition, for example, that in base 15, every number keeps the same last “digit” when raised to the 9th power since φ(15) = 8.

We only need to be concerned with the case of a being a multiple of α or β since Euler’s theorem proves our theorem for the rest of the cases. If a = αβ our theorem obviously holds, so assume a is some multiple of α, a = kα with k less than β (and hence relatively prime to β).

We need to show that αβ divides (kα)^φ(αβ)+1 – kα. This expression is clearly divisible by α, so the remaining task is to show that it is divisible by β. We’ll show that (kα)^φ(αβ) – 1 is divisible by β.

Since α and k are relatively prime to β, Euler’s theorem tells us that α^φ(β) and k^φ(β) are congruent to 1 (mod β). This implies that kα^φ(β) is congruent to 1, and so kα^φ(α)φ(β) is also congruent to 1 (mod β). One of the basic properties of φ is that for relatively prime arguments α and β, φ(αβ) = φ(α)φ(β) and so we’re done.

Exercise: How much more can you generalize this? Does the base have to be the product of two distinct primes?

URL: http://www.johndcook.com/blog/2015/07/04/when-the-last-digits-of-powers-dont-change/

Friday, 03 July 2015

Virginia Is A Good Girl - 01:21

Ads by Project Wonderful! Your ad could be here, right now.

QC is just going to be Emily talking about animals from now on

URL: http://questionablecontent.net/view.php?comic=2995

Thursday, 02 July 2015

No, I’m not a bot. - 14:54

Periodically someone on Twitter will suggest that one of my Twitter accounts is a bot. Others will reply in the second person plural, suggesting that there’s a group of people behind one of the accounts. These accounts aren’t run by a bot or a committee, just me.

I do use software to schedule my tweets in advance. Most of the tweets from my personal account are live. Most of the tweets from my topic accounts are scheduled, though some are live. All replies are manual, not automated, and I don’t scrape content from anywhere.

Occasionally I read the responses to these accounts and sometimes I reply. But with over half a million followers (total, not unique) I don’t try to keep up with all the responses. If you’d like to contact me, you can do so here. That I do keep up with.

claimtoken-5595a093e11d8

URL: http://www.johndcook.com/blog/2015/07/02/no-im-not-a-bot/

Wednesday, 01 July 2015

Canis Canis - 01:50

Ads by Project Wonderful! Your ad could be here, right now.

DOGS

URL: http://questionablecontent.net/view.php?comic=2993

Tuesday, 30 June 2015

Hy-Phen - 03:17

Ads by Project Wonderful! Your ad could be here, right now.

Today's comic is brought to you by jetlag

URL: http://questionablecontent.net/view.php?comic=2992

Sunday, 28 June 2015

#Buttrocket Forever - 21:15

Ads by Project Wonderful! Your ad could be here, right now.

Hey everyone, just a quick heads-up:

The next three months are going to be really, really, really busy for me. I'm working on two comics, of course. I'm also finishing up the next QC book. I've ALSO got a big exciting family event and the biggest convention of the year (GenCon) within days of each other at the end of July. I'm ALSO working on moving to Canada in the fall! That is a VERY COMPLEX PROCESS. And once I'm ready to move I have to pack all my stuff and get it shipped up and all SORTS of other insanely complicated things. There is more stuff going on as well but those are the big important time-consuming things.

So basically I have a lot going on, a lot more than I usually do, and if I miss a comic update here or there I would appreciate your patience and understanding. My hope is that I WON'T miss any updates, but I am trying to be realistic about everything I have to accomplish in the next couple months. QC is and will remain my first priority, and I will do my best (as always) to update on schedule. I appreciate your support and understanding.

Man I used to write about my personal life ALL THE TIME here and now it just feels super weird. I guess that is what twitter is for.

URL: http://questionablecontent.net/view.php?comic=2991

The name we give to bright ideas - 17:28

From The Book of Strange New Things:

… I said that if science could come up with something like the Jump it could surely solve a problem like that. Severin seized hold of that word, “science.” Science, he said, is not some mysterious larger-than-life force, it’s just the name we give to bright ideas that individual guys have when they’re lying in bed at night, and that if the fuel thing bothered me so much, there was nothing stopping me from having a bright idea to solve it …

URL: http://www.johndcook.com/blog/2015/06/28/the-name-we-give-to-bright-ideas/

Thursday, 25 June 2015

Breaking The First Rule - 21:55

Ads by Project Wonderful! Your ad could be here, right now.

HON HON HON

URL: http://questionablecontent.net/view.php?comic=2990

Wednesday, 24 June 2015

First Rule - 20:52

Ads by Project Wonderful! Your ad could be here, right now.

Good to see Pintsize is feeling better.

URL: http://questionablecontent.net/view.php?comic=2989

Tuesday, 23 June 2015

All Work And No Play - 21:29

Ads by Project Wonderful! Your ad could be here, right now.

hurf burf

URL: http://questionablecontent.net/view.php?comic=2987

Monday, 15 June 2015

QC Guest Week 2015: KB Spangler - 02:23

Ads by Project Wonderful! Your ad could be here, right now.

QC Guest Strip Time Woo!

Today's comic is brought to you by K.B. Spangler from A Girl And Her Fed! She is also a heck of a good author and you should check out her prose work as well!

#buttrocket

URL: http://questionablecontent.net/view.php?comic=2981

Friday, 12 June 2015

The Talk 2: Talk Harder - 02:54

Ads by Project Wonderful! Your ad could be here, right now.

This seems like a good way to wrap up the week. Next week is my birthday and I am going on vacation! I have some amazing guest comics lined up, I think you will enjoy them.

It is also the end of the second chapter of my other comic Alice Grove. Now might be a good time to jump in if you've been confused as to what the hell is going on!*

*you will probably still be confused

URL: http://questionablecontent.net/view.php?comic=2980

Thursday, 11 June 2015

Empathy - 02:43

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=2979

Wednesday, 10 June 2015

That Mug Is Terrifying - 02:49

Ads by Project Wonderful! Your ad could be here, right now.

Claire is no dummy. Except when she is.

URL: http://questionablecontent.net/view.php?comic=2978

Tuesday, 09 June 2015

Interview with Mozilla’s Aaron Turon - 16:00

On the heels of Rust’s 1.0 release, we are pleased to be able to interview Mozilla’s Aaron Turon, a member of Rust’s core team (which is the leadership for the project that sets the overall direction). This is our third interview with a PL PhD working in industry.

Aaron Turon

What is your academic background?

I received my undergraduate degree at the University of Chicago, at a time where a lot of PL was happening (a lot of the folks who built or studied Standard ML were there); I did some research under John Reppy. After that, I went on to do a PhD at Northeastern University, which continues to have a thriving PL group; I was supervised by Mitchell Wand. Finally, I was a post-doc under Derek Dreyer at the Max Planck Institute for Software Systems (MPI-SWS).

What motivated Mozilla to create Rust?

We wanted a language in which you could system software, like a web browser, that has both great performance (including good responsiveness and low power usage) and a high degree of reliability and security.

All mainstream browsers are built in C++, because that’s historically been the only language that can give you the strong low-level control you need to eke out maximal performance, while working reasonably well at the scale of something like a browser. But C++ is not so great on the reliability/security front, and it’s not easy to modify a C++ codebase to leverage multicore processors to increase responsiveness and decrease power usage.

Mozilla started two projects — the language Rust, and the browser engine Servo — to try to shift the long-standing tradeoff between safety and control.

Rust was designed from the ground up to compete with the low-level control of C++, while offering memory safety guarantees found in higher-level languages, and making concurrent programming less error prone. Servo is a brand new browser engine whose core pieces are all written in Rust, which is already capable of rendering a wide range of sites in the wild, and which shows promising performance improvements and power reduction over current browsers.

Can you say a bit about Rust’s design goals?

Rust provides three key value propositions:

Memory safety without garbage collection
Concurrency without data races
Abstraction without overhead

Each of these value propositions imposes constraints on its design. For example, the lack of GC is an important constraint for applications like browsers, where a GC pause could kill UI responsiveness, and low memory footprint is highly desirable. But traditionally, managing memory directly opens the door to dangling pointers and other memory safety problems, which in turn makes exploits possible.

The key for Rust is a statically-enforced discipline of ownership and borrowing, which is at heart an affine type system. Borrowing allows you to create a reference to an object that is tied to some lexical scope, and can be passed to functions called within that scope. Getting the borrowing system to work smoothly, at scale, was probably the key turning point in Rust’s design, and took many iterations.

Ownership and borrowing leads to a memory (or really, resource) management scheme that feels as automatic as GC, has none of GC’s overhead, and guarantees memory safety (since you can only chase a pointer you have borrowed or own). Moreover, the same scheme also prevents data races (unsynchronized concurrent access to state that involves a write) because permission to mutate a value is uniquely owned by a single thread.

How has the research literature influenced Rust’s design (both positively and negatively), if at all?

A lot of Rust’s basic design was settled before I joined, but my impression is that the literature has been enormously influential. Certainly, you can trace very clear lines back to Cyclone and Haskell. The ownership and borrowing scheme draws inspiration from Cyclone and modern C++ (unique_ptr and friends). The basic form of abstraction — traits — combines the best of Haskell’s type classes and C++’s templates (cf. “How to make ad hoc polymorphism less ad hoc“). In particular, you can use a single trait in the style of bounded parametric polymorphism (generics), which results in static dispatch, or OO-style polymorphism, which results in dynamic dispatch. But ideas haven’t been simply taken “off the shelf:” Rust has benefited from repeated iteration on its design, which has meant months of breaking changes. Making these ideas practical, making them work together at scale, has taken time and experience. This iteration has been supported by an enthusiastic and patient community, which has been willing to continuously update code as the designs shifted.

When doing a full design for a language with any hope of mainstream success, you have to think very carefully about the overall complexity budget: you can only afford a certain amount of unfamiliarity in the core programming model, and it had better carry its weight. Academic work tends to focus on a particular language feature, and it’s easy to do a deep dive and wind up with something far too complicated to be plugged into an already-complicated language.

I think that tension is probably fundamental — the incentives are very different in the research world. But in general, research designs that are simple and can fit well into existing programming models are going to be much easier for us to take advantage of.

In terms of the community’s perception of academic ideas: there is an overall sense that Rust has tried to learn from both past languages and prior research, and our community sees that as very valuable. But I think, even more than that, there are certain core principles (some of which I mentioned above) that we hammer on relentlessly, and that directly motivate the use of e.g. affine types. In other words, these ideas aren’t just gimmicks or sideshows — they are the heart of the programming model — and I think people get that.

What are the current goals for the language and its ecosystem?

In the short term, we’ve been very focused on a successful 1.0 release and its aftermath. There are a number of known pain points — compiler performance being a major one — that we will need to address to smooth out the basic experience of working with Rust today.

In the longer term, I think focus is gradually going to move away from the core language (while still addressing a few known gaps, e.g., involving ownership and traits ¹) and toward the ecosystem and tooling story. We’ve made that a priority by shipping 1.0 with “Cargo”, a package manager in the vein of Ruby’s Bundler, together with crates.io, our central repository of community libraries; it’s in the community’s hands. We also plan to grow the number of platforms with “first class support”, and make Rust code easy to embed into a wide range of contexts and other languages.

We envision two major constituencies for Rust. The first is existing system programmers, largely using C or C++ today. The second is programmers working in higher-level languages, but who need to dip down to the systems level to grab the last bit of performance. Those groups in turn inform our goals: to continue to push Rust into ever-lower-level contexts, while making the language ever friendlier to high-level abstractions.

Is Mozilla open to outside contributions to the language? How can the broader community get involved, if they want to?

The language is very community-driven, with over 1,000 contributors to the 1.0 release. The enthusiasm, know-how, and investment coming from volunteers world-wide blows me away on a regular basis. If you are interested in helping, there are a number of welcoming forums that you can find on the Rust home page and http://www.ncameron.org/rust.html.

One of my favorite aspects of the Rust community/governance model is our RFC process: all major changes to the language or standard library first go through a written design document and community-wide discussion, before finally being approved/declined by a team of people in charge of a particular area. In academia I always found that I uncovered a lot when actually writing up a paper, and the RFC process winds up working out in much the same way.

Shifting gears a bit: Can you say more about how you ended up at Mozilla?

Originally, I was pretty strongly on an academic career path but ended up changing course, due to a lot of factors.

Most fundamentally, I did not feel that I could (personally!) manage to maintain a satisfying career as a traditional academic while retaining sufficient time and energy for my family; I was never able to maintain boundaries I was happy with. Moreover, my passion for problem solving and design work is much greater than for teaching, reviewing, and so on — but these latter duties by themselves already constitute a full time job, if you want to do them well.

On the other hand, it gradually became clear that I could do perfectly satisfying work outside the realm of traditional academia.

My current position is in Mozilla Research, and my title includes the word ‘research’, so I haven’t escaped altogether. But it’s true that our group puts much more emphasis on creating viable (but research-driven) products, rather than pure research. We do hope to publish a series of papers on Rust in the next few years and I continue to give talks and collaborate with more academic groups.

Personally, I’ve found this to be a good fit, and I feel very fortunate to have ended up here. I have opportunities to do genuine, deep research, but I am also fitting that into the context of a viable language with a thriving community. It’s thrilling to author RFCs that are widely read and responded to by the Rust community, and there is much more of a sense of shared values and concrete problems we are all working to solve than I experienced in academia.

In the end, the chance to work on a project with a real shot at becoming a major part of the tech landscape, but which also values research ideas and methodology, was too good to pass up.

What is your view of the value of a PhD, particularly from the perspective of working at a company like Mozilla?

I’m not sure I can speak to Mozilla broadly, but as a member of the Rust team, the skills I gained in academia are indispensable. Being able to boil down a problem to its fundamental constraints, being able to communicate clearly in speech and writing, being able to pitch an idea, being confident that I can learn anything given enough time — these are skills I rely on every day. While it’s true that I don’t have the breadth of engineering experience that some of my amazing peers do, I feel that my academic background has given me complementary strengths that have made me an effective part of the team.

That’s not to say that everyone should go get a PhD, of course — just that what I learned in the process has yielded ongoing value, even outside of a traditional research setting.

Follow Aaron on Twitter at @aaron_turon.

Notes:

For language afficionados, here’s some more detail: For ownership, the borrowing system is currently limited to purely lexical scopes, but this is not an essential restriction and we hope to generalize borrowing to accommodate patterns that run afoul of it. For traits, the biggest limitation is the lack of higher-kinded types: you cannot write a trait that abstracts over a type constructor. This limitation is painful in part because type constructors in Rust are often parameterized over “lifetimes”, which represent the scope of some borrow. The inability to abstract over these kinds of constructors makes it hard to talk, generically, about many basic patterns of ownership. We are also interested in supporting selectable static/dynamic dispatch based on return values (not just input values, as is the case now), which will require something like existential types that are monomorphized at compile time. More speculatively, there are certain use-cases (e.g., the DOM) that derive strong benefits from the reuse made possible by class-based inheritance, which Rust currently does not support. We are examining whether we can achieve this reuse by extending existing features, or whether a direct treatment is warranted. ↩

The post Interview with Mozilla’s Aaron Turon appeared first on The Programming Languages Enthusiast.

URL: http://www.pl-enthusiast.net/2015/06/09/interview-with-mozillas-aaron-turon/

Chain Of Command - 04:41

Ads by Project Wonderful! Your ad could be here, right now.

All three coffees are for Faye

URL: http://questionablecontent.net/view.php?comic=2977

Monday, 08 June 2015

Storm's A-Comin' - 01:52

Ads by Project Wonderful! Your ad could be here, right now.

NYCC Special Edition was a great time! Thanks so much to everyone who came out.

URL: http://questionablecontent.net/view.php?comic=2976

Friday, 05 June 2015

Or A Multitool - 02:20

Ads by Project Wonderful! Your ad could be here, right now.

I will be at NYCC Special Edition this weekend! Come find me at the Topatoco table! WOOO

URL: http://questionablecontent.net/view.php?comic=2975

Thursday, 04 June 2015

Millions Of Peaches - 01:48

Ads by Project Wonderful! Your ad could be here, right now.

I will be at NYCC Special Edition this weekend! Come say hi!!!

URL: http://questionablecontent.net/view.php?comic=2974

Wednesday, 03 June 2015

Steps One Through X - 01:10

Ads by Project Wonderful! Your ad could be here, right now.

Bad decisions are sometimes not quantized

URL: http://questionablecontent.net/view.php?comic=2973

Tuesday, 02 June 2015

The PL Enthusiast Turns One! - 16:07

Our first post to this blog appeared on June 2, 2014, so that makes today our one year anniversary!

It’s the PL Enthusiast’s 1-year birthday

In celebration of the occasion, this post takes a look back over the last year as well as a brief look to what’s ahead.

We’d love your feedback about what you’ve seen so far, and what you’d like to see (or contribute!) next.

Snapshot, by the numbers

Our posts over the last year have comprised a variety of topics, covering PL people, research, process, and practice. In particular, we’ve written 41 posts in total, comprising the following topics:

PL topic overviews (the “What is … ?” series) (5)
PL research and its connections other areas (8)
PL in practice (3)
PL in education (7)
Process of science, and research analytics (7)
Interviews with young scientists, professors, and practitioners (8)
Remembrances, other (3)

As of June 1, 2015, these posts have been viewed 164,512 times, according to Google Analytics. Our most popular posts were the following:

What is probabilistic programming? (26,354 pageviews)
What is type safety? (26,319)
What is memory safety? (21,201)
Interview with Go’s Russ Cox and Sameer Ajmani (12,262)
Who teaches functional programming? (12,218)
Dynamic Software Updating: Linux 4.0 and Beyond (6,068)
What is PL research and how is it useful? (5,496)
Why are some languages adopted and others aren’t? (3,924)
Bridging Algorithms and Programming Languages (3,091)
What is noninterference, and how do we enforce it? (2,571)

All five of the general-overview posts (1,2,3,6,10) make the top ten, which is reassuring since they were directed at a broader audience. Two posts looking at PL research and its connections to other areas (7,9), one looking at PL in education (5), and one looking at PL in practice (8) also make the top 10. My interview with Russ Cox and Sameer Ajmani (4) rounds out the list.

Looking back

Here I have compiled a summary of the topic areas listed above. Think of it as an elaborated table of contents for the last year. Afterward I elaborate on plans for next year, and call for your contributions.

PL Topic overviews (the “What is .. ?” series)

Type safety: Well-typed programs are well-defined

Several posts begin with the phrase “What is … ?” This series started while I was working on my Coursera course on software security, trying to explain memory safety. When I determined there wasn’t a clear answer, Swarat encouraged me to turn my exploration to find one into a blog post. The post on memory safety was followed up by posts on type safety, probabilistic programming, noninterference, and dynamic software updating.

PL research and its connections to other areas

Connecting PL methods and implementations to those in cryptography

Connections between PL and Crypto

A goal of this blog is to highlight the connections between PL research and other areas, while also discussing core theoretical and practical developments. Several times we have looked at connections between PL and Cryptography, first looking at securing computation, and then looking at PL-Crypto synergies more generally, particularly in the ways the two communities perform formal reasoning. We have also looked at bridging PL and algorithms research. My most recent post has looked at PL research generally, showing how it is impacting many areas of computer science.

Views from the field: PL in practice and in education

We have looked in several ways at how programming languages, and ideas underlying their design and use, are penetrating industry and education.

Heartbleed

We looked at research on programming language adoption and the IEEE’s top language list. We have also looked closely at the Heartbleed SSL bug and how PL techniques might have been brought to bear to prevent it.

On the education side, Swarat surveyed whether universities teach functional programming; FP has historically been a font of ideas that have penetrated the mainstream. Swarat also detailed his experience (in two parts) introducing ideas about formal verification into the undergraduate curriculum, while taking a broader, two part look at PL research’s impact in K-16 education.

As an activity at the intersection of education and practice, I introduced the Build-it, Break-it, Fix-it programming contest, meant to study at “what works” for creating high-quality software while providing a stimulating educational experience. Our first contest had some interesting outcomes, and the next iteration is currently underway. There will be more to say about the results of this contest, in future posts.

Process of scientific research

Readers of the PL Enthusiast include scientists, and those interested in the process of science, especially as applied to work on programming languages. As such, we wrote posts about the process of peer review, and provided advice on writing peer reviews and author response (a relatively recent mechanism in CS peer review processes). We also previewed SNAPL, a new kind of PL research conference (we hope to write a SNAPL retrospective soon).

POPL collaboration graph

Swarat also had a series of posts with an analytics-based view of the PL, and CS, research communities. Most recently, he developed an app for ranking CS research productivity. Previously, he looked at author-based collaboration graphs for PLDI and OOPSLA, and for POPL.

Interviews

PL researchers are having an impact in many ways. We decided to interview some of them to find out what they are up to.

Ravi Chugh

Our first set of interviews were with recent PhDs having just started their first academic job. We asked them about their work, their job search, their plans, and for advice to those in a similar position. Our subjects were Emina Torlak, Aws Albagouthi, Ravi Chugh, Cindy Rubio-Gonzalez, and Bill Harris. We expect to carry on the series with this year’s new professors.

Stephen Freund

We also interviewed Steve Freund about life as a professor at Williams, which is different than a traditional research university in that it’s focused on undergraduate education while still encouraging groundbreaking research.

Avik Chaudhuri

More recently we have begun a series of interviews with PhDs working in industry on PL topics. We discussed how they got to where they are, what they have worked on and are working on, and their current perspective about PhDs and PL. We started with Facebook’s Avik Chaudhuri, responsible for the Flow extension to Javascript, and followed up with Google’s Russ Cox and Sameer Ajmani, who are part of of the Go development team. We have several more of these interviews in the works.

Remembrances and Reviews

Over the past year, two pillars of the field, Susan Horwitz and Radhia Cousot, passed away, and so we posted remembrances of them and their work. Swarat wrote Radia’s remembrance, and guest Tom Ball wrote about Susan Horwitz. We also wrote a book review, on Copeland’s biography, Turing: Pioneer of the Information Age, on account of the popular Alan Turing bio-pic, The Imitation Game.

Call for contributions

Our plan is to continue to write posts on our own but we’d like your help. Are you interested in a topic that we haven’t covered yet? For example, do you have ideas for posts on the following:

Overview topics (e.g., another installment of the “What is …?” series)
Emerging or interesting research topics or connections
Impacts and developments of PL in (CS, and other) education, and industry
“Meta” topics on research and the process of research
People (or kinds of people) to interview
Books to review
Other kinds of posts we have so far neglected (e.g., trip reports)

If you do, please send them our way! Probably 1/4 of the posts that have appeared this year have been developed based on from suggestions from readers.

Also, if you might be interested in authoring a post, or you would like to suggest someone that we invite to do so, please send us an e-mail. While Swarat and I plan to remain as co-editors of the blog, we are interested in expanding the list of contributors. Outside contributions, especially from Tom Ball (on Susan Horwitz) and our interviewees, have been excellent, and we’d like even more.

We’re looking forward to another great year!

The post The PL Enthusiast Turns One! appeared first on The Programming Languages Enthusiast.

URL: http://www.pl-enthusiast.net/2015/06/02/the-pl-enthusiast-turns-one/

The Handoff - 01:57

Ads by Project Wonderful! Your ad could be here, right now.

Marten is wise to her ways.

NYCC Special Edition is this weekend! I will be there! Come say hi!

URL: http://questionablecontent.net/view.php?comic=2972

Monday, 01 June 2015

This Is Fine - 03:58

Ads by Project Wonderful! Your ad could be here, right now.

Goddamnit Faye

I will be at NYCC Special Edition in New York this weekend! You should come say hi.

URL: http://questionablecontent.net/view.php?comic=2971

Thursday, 28 May 2015

People Of VanCAF - 05:53

Ads by Project Wonderful! Your ad could be here, right now.

I am on my way home from VanCAF! Here are some people I drew at the show.

URL: http://questionablecontent.net/view.php?comic=2969

Wednesday, 27 May 2015

What is PL research and how is it useful? - 19:00

If you are in the world of programming languages research, the announcement that UW had hired Ras Bodik away from Berkeley was big news. Quoting UW’s announcement:

Ras’s arrival creates a truly world-class programming languages group in UW CSE that crosses into systems, databases, security, architecture, and other areas. Ras joins recent hires Emina Torlak, ¹ Alvin Cheung, Xi Wang, and Zach Tatlock, and senior faculty members Dan Grossman and Mike Ernst.

And there’s also Luis Ceze, a regular publisher at PLDI, who ought to be considered as part of this group. With him, UW CSE has 8 out of 54 faculty with strong ties to PL. Hiring five PL-oriented faculty in three years, thus making PL a significant fraction of the faculty’s expertise, is (highly) atypical. What motivated UW CSE in its decision-making? I don’t know for sure, but I suspect they see that PL-oriented researchers are making huge inroads on important problems, bringing a useful perspective to unlock new results.

In this post, I argue why studying PL (for your PhD, Masters, or just for fun) can be interesting and rewarding, both because of what you will learn, and because of the increasing opportunities that are available, e.g., in terms of impactful research topics and funding for them.

What is PL Research?

When your hear that someone’s research area is programming languages, what do you think they do?

Some people I’ve talked to think that programming languages researchers work on, well, programming languages. They imagine someone developing a new esoteric, idiosyncratic programming language that few people (maybe just their hapless students) will use. Even if these people believe that developing new languages has merit, they still wonder whether such research is a waste of time because, frankly, aren’t existing languages good enough, and completely entrenched?

First, I should point out that not all academic/research-driven languages remain in the academic ivory tower. Scala, Haskell, Racket, ² and OCaml are at least four examples of academically driven languages that have serious use in the real world. (For more about PL adoption, see my previous posts on PL adoption research and reality.)

But the main point I want to make is that PL research is broader than designing and implementing new languages. To me, a PL researcher is someone who views the programming language as having a central place in solving computing problems. From this vantage point, PL researchers tend to focus on developing general abstractions, or building blocks, for solving problems, or classes of problems. PL research also considers software behavior in a rigorous and general way, e.g., to prove that (classes of) programs enjoy properties we want, and/or eschew properties we don’t. This approach has proven to be very valuable for solving a wide ranging set of problems.

Solutions via Simple, General-purpose Abstractions

The ethos of PL research is to not just find solutions to important problems, but to find the best expression of those solutions, typically in the form of a kind of language, language extension, library, program analysis, or transformation. The hope is for simple, understandable solutions that are also general: By being part of (or acting at the level of) a language, they apply to many (and many sorts of) programs, and possibly many sorts of problems.

As an example, consider probabilistic programming, which I’ve covered earlier on this blog. Here, the problem is general-purpose machine learning. A probabilistic programming language is a programming language extended with constructs for sampling from standard distributions (Gaussian, beta, Bernoulli, uniform, etc.) and observing (or assuming) possible outcomes. With these extensions, we can write programs that, under a standard semantics, act as a sampling functions for (non-standard) distributions. Then, we can apply a different semantics to a probabilistic program that interprets it as the distribution itself by allowing us to use Bayesian methods to learn its unknown parameters based on observations. Probabilistic programming is general-purpose, so it applies to many machine learning problems (and problems from other domains) while being simple and elegant.

There are many more examples of this approach. Here are two more.

Suppose I have a function F such that F(X) = Y. An incremental version of F would accept a small change to X with which it could update the output Y with little extra work (compared to running re-running F on the entire modified input). Researchers can, and do, design novel incremental algorithms directly. For example, this 100-page 2002 PhD dissertation defines an incremental algorithm for computing the convex hull of a set of points. A PL approach to incremental computation is exemplified by Adapton, ³ which defines a few simple language extensions and some run-time support, allows the programmer to write a normal-looking (non-incremental) Quickhull implementation and get an incremental version of it for free. Adapton is general-purpose so it can be used to incrementalize many algorithms, not just convex hull.

Our last example: An authenticated data structure (ADS) implements a form of verified computation, whereby an untrusted server can perform operations on the data structure and produce a proof with which a client can verify the operation was performed correctly. Cryptography researchers have proposed many such data structures, starting from (full, binary) Merkle trees, and progressing through red-black trees, B-trees, the Bitcoin blockchain, and more. Rather than treat each new ADS as a new research problem, the PL approach, exemplified by the LambdaAuth work previously considered on this blog, is to define a simple language extension with which you can program ADSs. The language metatheory ensures that the result produced by the LambdaAuth compiler is secure. And the language is general enough to be able to program the single-point results published previously.

And of course there are many other examples, from programming software-defined networks to information flow-secure software to compiler optimizations to program verification to program synthesis (e.g., per the multi-institution ExCape project). And of course I am only scratching the surface of the breadth of PL research generally, which consists of many exciting topics and directions.

Defining, and Understanding, Software Behavior

A key feature of making the above approach work is a deep knowledge of the semantics of programs. We have to be able to carefully state what programs do to be able to say that (e.g., when using a particular set of abstractions) programs do what we want and/or do not do what we do not want. Semantics is a decades-old topic, and still a rich area of research. But methods are mature enough to be applicable to most real-world problems while being easy to use.

A PL-minded approach to solving a problem often includes formulating the solution within a formal semantics (typically operational semantics ⁴) and then proving that all programs that employ the solution will behave as we would like them to. ⁵ For example, for probabilistic programming, Claret et al formulated the operational semantics of probabilistic programs as performing sampling, and then proved that their algorithm to infer the distribution denoted by that program matches the distribution of all possible (sampling) executions. For incremental computation, Hammer et al defined the semantics of normal execution in Adapton and proved that their algorithm for incremental update, embedded as a choice within that semantics, produces the same outcome as re-running from scratch. For authenticated data structures, Miller et al formulated the semantics of their language extension from the point of view of both server and client and proved that if the server attempts to lie about the result, the client will detect it with high probability. Importantly, these proofs apply to all (well-defined/well-typed) programs in the language, not to just particular algorithms.

The above examples target particular problems using extended languages and perhaps novel semantics. Another way to “understand software behavior” is to take a program in a standard language and prove a program-specific property mechanically, and ideally automatically, by using techniques such as static analysis, dynamic analysis, and formal mechanization (e.g., in Coq). Many of the success stories of PL, such as the ASTREE static analyzer for proving the absence of run-time errors in C programs and the SLAM model checker that is the core of Microsoft’s Static Driver Verifier, and the CompCert verified compiler, come from this tradition. Importantly, advances in ways to understand standard-language behavior cross-pollinate with ideas for developing better languages and language abstractions for addressing new problems.

PL+X is an effective combination

I’ve picked the three examples in the above discussion for a reason: They are an approach to solving general problems using PL-minded techniques in combination with techniques from other communities, like machine learning inference algorithms or cryptography. Returning to UW, several of the people listed in the announcement do not self-affiliate with the PL community, but rather with communities like Systems (Xi), Databases (Cheung), Software Engineering (Torlak), and Architecture (Ceze). But they point to PL as a key enabler of what they do. For example, Cheung states,

I work on tools that make use of programming language techniques to improve application performance.

And Xi states,

My research interests are in building secure and reliable systems. Past work: Jitk verified in-kernel interpreter, the Stack undefined behavior checker, the Retro intrusion recovery system, and the Resin language runtime for securing web applications.

Conversely, many “PL people” I know bring their PL-style mindset and expertise to new domains, with much success. For example, Dave Walker and Nate Foster have had a huge influence in the software-defined networks world via the Frenetic project. Andy Gordon and Johannes Borgström are leaders on applying probabilistic programming languages to machine learning with their Fun and Tabular languages. Umut Acar and Ozgur Sumer brought their expertise in incremental computation to solve open problems in adaptive Bayesian inference.

One notable success in bringing a PL perspective to an old problem is in the area of program synthesis, and Sumit Gulwani‘s FlashFill work, introduced in the paper Automating String Processing in Spreadsheets Using Input-Output Examples, from POPL’11. The problem of program synthesis has been around for a long time in the AI community (machine learning is essentially trying to synthesize a function using examples), and Sumit’s PL perspective was important to cracking it successfully. Sumit says:

With regards to comparison with AI: In AI, we typically perform heuristic based search. Whereas FlashFill is an instance of search based on deductive methods, where we systematically compute the set of all/many programs that are consistent with the examples that the user provided. The set of all such programs is the space over which ML based ranking techniques operate. I regard the deductive methods and the data structures for succinctly representing a huge set of programs (also called “version space algebras”) as PLish concepts.

FlashFill is now, of course, a feature of Excel. Sumit gave a nice talk about it, and his approach to the work, at PLMW’15.

Looking ahead

I hope I have convinced you by this point that PL research brings a very useful perspective to solving general problems. But if you study PL, or hire someone who does it, are their prospects really that good?

In terms of resources, I am struck by how much research funding is going toward problems that can readily incorporate a PL perspective. DARPA I20 has had a flurry of BAAs come out in the last few years for which PL should play, or is playing, an important role. I count at least 12 programs (APAC, CRASH, CSFV, HACMS, CFAR, CGC, STAC, BRASS, BRANDEIS, VET, PPAML, and MUSE) and probably there are more. These programs focus on problems in systems, software engineering, computer security, and machine learning. NSF also has a large funding profile for PL work, e.g., via SHF and SATC. And of course funding is available through AFOSR, ARL, IARPA, and many other agencies in the US and abroad.

In terms of jobs, I take the UW CSE hiring as a positive sign. Other great places regard PL highly as well. Microsoft Research’s (MSR) Redmond lab’s RiSE group is 30 researchers, out of 300 total there, and MSR Cambridge also has an incredible, and sizeable, PL group. CMU, Indiana, Northeastern, and UT Austin also have been actively growing the size of their PL groups (see Swarat’s ranking app). I did an informal poll of PL PhD graduates that were on the market this season, and people are taking faculty jobs (or have offers for such jobs) at CMU, Grinnell, Hendrix College, UIUC, Pomona, Princeton, University College London, University of Colorado (Boulder), and Wisconsin; and are taking industry jobs at Apple, Facebook (actively using PL research), Google, Intel Labs, Microsoft, Oracle, and Samsung. People are also taking post-doctoral positions at places like Imperial College London, National University of Cordoba, Northwestern, UCSD, the University of Cambridge, and the University of Maryland.

Today, there are many exciting and important problems in the public eye, from cybersecurity to data science to quantum computing to green computing to health informatics. PL research can bring, and is bringing, a great perspective and skillset to these problems, unlocking groundbreaking results. Check it out!

Notes:

We previously interviewed Emina on this blog. ↩
Racket started as a dialect of Scheme, and from this point of view, it has enjoyed long-running real-world use. ↩
General-purpose incremental computation has a long history; Umut Acar‘s work on self-adjusting computation is notable in bringing it to maturity. ↩
Operational semantics is the most popular choice, but other forms of semantics, like denotational and axiomatic semantics, are also used. ↩
This approach boils down to the art of formalization, with proofs by structural induction (i.e., not just on mathematical numbers, but on inductively defined structures like syntax trees and proof derivations). This characterization is the neat summary of Sumit Gulwani; his recent CACM article also shows how PL-style thinking can be broadened outside of computer science/programming problems, e.g., to education and computational thinking. ↩

The post What is PL research and how is it useful? appeared first on The Programming Languages Enthusiast.

URL: http://www.pl-enthusiast.net/2015/05/27/what-is-pl-research-and-how-is-it-useful/

Back To The Future You - 05:58

Ads by Project Wonderful! Your ad could be here, right now.

Hannelore has a beautiful mind

URL: http://questionablecontent.net/view.php?comic=2968

Tuesday, 26 May 2015

As Seen On TV - 06:01

Ads by Project Wonderful! Your ad could be here, right now.

Seems plausible

URL: http://questionablecontent.net/view.php?comic=2967

Monday, 25 May 2015

Adventures In Hypochondria - 06:50

Ads by Project Wonderful! Your ad could be here, right now.

me.png

VanCAF was amazing and I am very tired good night

URL: http://questionablecontent.net/view.php?comic=2966

Friday, 22 May 2015

Cryosuffering - 02:16

Ads by Project Wonderful! Your ad could be here, right now.

insert lupus joke here

VanCAF is this weekend! I will be there! You should come say hi!

URL: http://questionablecontent.net/view.php?comic=2965

Thursday, 21 May 2015

Always Buckle Up - 02:49

Ads by Project Wonderful! Your ad could be here, right now.

VanCAF is this weekend! I will be there! You should come say hi!

URL: http://questionablecontent.net/view.php?comic=2964

Wednesday, 20 May 2015

Diagnosis: Murder - 02:59

Ads by Project Wonderful! Your ad could be here, right now.

VanCAF is this weekend! I will be there! You should come say hi!

URL: http://questionablecontent.net/view.php?comic=2963

Tuesday, 19 May 2015

Confirmation Of Fears - 01:32

Ads by Project Wonderful! Your ad could be here, right now.

Never, ever look up your symptoms on the internet.

VanCAF is this weekend! I will be there! You should come say hi!

URL: http://questionablecontent.net/view.php?comic=2962

Monday, 18 May 2015

It's Not A Tumor - 02:05

Ads by Project Wonderful! Your ad could be here, right now.

This is not a storyline about cancer.

URL: http://questionablecontent.net/view.php?comic=2961

Friday, 15 May 2015

Inconvenient Truths - 03:33

Ads by Project Wonderful! Your ad could be here, right now.

PROLAPSE WEEKLY was cancelled after the first month and its editor was put in jail

URL: http://questionablecontent.net/view.php?comic=2960

Thursday, 14 May 2015

Beddy-Bye - 02:32

Ads by Project Wonderful! Your ad could be here, right now.

sleep tight!

URL: http://questionablecontent.net/view.php?comic=2959

Wednesday, 13 May 2015

Heat Dissipation - 02:13

Ads by Project Wonderful! Your ad could be here, right now.

fweeee

URL: http://questionablecontent.net/view.php?comic=2958

Tuesday, 12 May 2015

Culinary Tradition - 04:21

Ads by Project Wonderful! Your ad could be here, right now.

Whew! Back from TCAF much, much later than I was expecting. See you tomorrow.

URL: http://questionablecontent.net/view.php?comic=2957

Friday, 08 May 2015

Proclivities - 02:29

Ads by Project Wonderful! Your ad could be here, right now.

I don't know who the big yellow guy is but he's a cutie!

I will be at TCAF this weekend! You should totally come say hi!

I have a QC thing for sale on eBay!

URL: http://questionablecontent.net/view.php?comic=2956

Thursday, 07 May 2015

Adequate Support - 02:40

Ads by Project Wonderful! Your ad could be here, right now.

I will be at TCAF this weekend! You should totally come say hi!

URL: http://questionablecontent.net/view.php?comic=2955

Wednesday, 06 May 2015

A Nice Stroll - 01:25

Ads by Project Wonderful! Your ad could be here, right now.

I will be at TCAF this weekend! You should totally come say hi!

URL: http://questionablecontent.net/view.php?comic=2954

Tuesday, 05 May 2015

Follow Along - 01:03

Ads by Project Wonderful! Your ad could be here, right now.

I will be at TCAF this weekend! You should totally come say hi!

URL: http://questionablecontent.net/view.php?comic=2953

Monday, 04 May 2015

Daily Life - 03:16

Ads by Project Wonderful! Your ad could be here, right now.

Robots!

I will be at TCAF this weekend! You should totally come say hi!

URL: http://questionablecontent.net/view.php?comic=2952

Friday, 01 May 2015

To-Do List - 02:12

Ads by Project Wonderful! Your ad could be here, right now.

Sorry for the simple comic, I had a bad anxiety attack today and this was the best I could do.

URL: http://questionablecontent.net/view.php?comic=2951

Thursday, 30 April 2015

Ranking CS Departments by Publication Productivity, Interactively - 15:12

Anyone familiar with American academia will tell you that the US News rankings of academic programs play an outsized role in this world. Among other things, US News ranks graduate programs of computer science, by their strength in the field at large as well as certain specialties. One of these specialties is Programming Languages, the focus of this blog.

The US News rankings are based solely on surveys. Department heads and directors of graduate studies at a couple of hundred universities are asked to assign numerical scores to graduate programs. Departments are ranked by the average score that they receive.

It’s easy to see that much can go wrong with such a methodology. A reputation-based ranking system is an election, and elections are meaningful only when their voters are well-informed. The worry here is that respondents are not necessarily qualified to rank programs in research areas that are not their own. Also, it is plausible that respondents would give higher scores to departments that have high overall prestige or that they are personally familiar with.

In this post, I propose using publication metrics as an input to a well-informed ranking process. Rather than propose a one-size-fits-all ranking, I provide a web application to allow users to compute their own rankings. This approach has limitations, which I discuss in detail, but I believe it’s a reasonable start to a better system.

Do we need rankings at all?

Many would argue that the idea of department rankings is inherently problematic. Any system of comparison must evaluate a department’s research quality using pithy metrics. To do this to a creative activity like research is reductionist and possibly harmful.

A glib rebuttal to this is that the genie is already out of the bottle. The marketplace has spoken, and it clearly likes university rankings. By not starting a conversation about a better system of ranking than US News’s, we reward the status quo.

A better justification is that department rankings are a valuable service to prospective students. Matching prospective graduate students to programs is a problem of resource allocation in a market. However, this market has information asymmetry, because students don’t have a clear idea of what makes for a good Ph.D. experience. When courting prospective students, universities put on shows that have limited connection to the reality of the graduate research experience. The problem is worse for international students, who frequently join universities sight unseen. As a result, it is easy for students to make suboptimal choices when selecting a program to join. At their best, university rankings help students make more informed decisions.

Ranking by objective metrics

What would a fairer system for ranking computer science departments look like? It seems to me that any such system should depend, in part, on real data on research productivity. The problem, of course, is that “research productivity” is a fuzzy concept. Efforts to approximate it using “bean counting” measures like paper or citation counts, or grant dollars, have basic shortcomings.

However, I think that these approximations have some value, especially when seen from the point of view of the end users of department rankings. Presumably, a prospective student would want to have a strong CV at the point when she finishes her Ph.D. She has a higher chance of doing so if the group she joins publishes regularly at top-tier publication venues in the areas in which she is interested. She is more likely to stay funded if her advisor has a track record of bringing in grant money. She is more likely to do highly cited research if her advisor has more highly cited papers than others in the same research area and level of seniority.

All in all, objective metrics have limitations, but also produce some useful signals. One could, at the least, make them a factor in computing department rankings, even using a reputation-based system. For example, survey respondents in a reputation-based rankings could choose (or be asked) to use productivity-based rankings as an input in their decision-making process. Presumably, this would lead respondents to make more informed judgments.

Interactive ranking: putting the user in charge

Another issue with existing ranking systems is that they are static, one-size-fits-all solutions. Think of a prospective student who is interested in the interface of Programming Languages (PL) and Machine Learning (ML). He should probably pick a department that has been active in PL and ML in recent times, and an advisor who has a strong track record in at least one of these areas. Depending on his interests, he might want an advisor who is primarily a PL researcher but also collaborates with ML folks, or the other way around. Finally, he may want to work with professors who are at a certain level of seniority, or whose students and postdocs have gotten high-profile research jobs. Unfortunately, current ranking systems do not support such nuanced decision-making.

Maybe what is needed, then, is a rankings application that can be customized to different user needs. For example, such an app could let a user assign weights to the different subareas of computer science, and rank departments and advisors by their weighted strength in these areas. The system would be fundamentally interactive: users would be able to change the weights on various variables and observe changes to the ranking results.

An interactive ranking system based on publication productivity

Over the last few weeks, I have coded up the first draft of such a ranking app. The rankings this app computes are based on a single objective metric: publication counts at top-quality outlets. The reason why I only used this metric is simple: data on which researcher publishes where is available from the DBLP bibliography database, and this data can be used to compute paper counts. On the other hand, I had no easy access to data on citations or funding or the history of a department’s former students. I believe the ranking system is defensible, as acceptance at top venues correlates, at least to some extent, with quality of research. However, by definition, it considers one dimension of a complex, multidimensional space. One might extend the app to allow ranking using a richer set of features.

DBLP doesn’t track institutions of researchers, but Alexandra Papoutsaki and her coauthors have recently developed a listing of faculty members at 50 top US universities. By cross-linking this dataset with DBLP data, one can determine where professors in a given department publish. ¹

I used feedback from colleagues and friends to select a few top-tier publication venues in several subareas of computer science (see here for more details). For example, the top venues in Programming Languages (PL) are the Symposium on Principles of Programming Languages (POPL), the Symposium on Programming Language Design and Implementation (PLDI), the Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), and the International Conference on Functional Programming (ICFP). The top venues for Algorithms and Complexity are the Symposium on Theory of Computing (STOC), the Conference on Foundations of Computer Science (FOCS), and the Symposium on Discrete Algorithms (SODA).

The application’s interface allows the user to select a time window within the last 15 years and to put weights on various areas. The app assigns a score to each professor by giving him or her w points (w is a number between 0 and 1) for each paper in a top-tier venue in an area with weight w. We also identify a set of relevant professors — intuitively, the set of professors who are prospective advisors in the areas of interest. To qualify as a relevant professor, a faculty member must have published 3 or more papers across top venues in areas where the user has put a nonzero weight, within the selected period. Departments are now ranked according to three different metrics.

1) Aggregate productivity. In this measure, the score of a department is the sum of the scores for all its professors. A department that scores high on this measure is likely to be a high-energy research environment, with a culture of publication in strong conferences in the areas of interest. However, this metric is likely to favor larger departments over smaller ones.

2) Maximal productivity. Here, the score of a department is the greatest score received by one of its professors. Unlike aggregate productivity, this metric is not directly affected by a department’s size. A justification for this measure is that at the end, a prospective student needs only one advisor. Consequently, joining a small department with one prolific researcher can possibly beat joining a larger department with multiple less productive researchers.

3) Group size. This metric estimates the size of a department’s group in the students’ areas of interest, by counting the number of relevant professors that it employs. This statistic puts the productivity rankings in perspective. Arguably, it is also of intrinsic interest to prospective students. Larger groups allow more courses, seminars, and interactions with fellow-students. On the other hand, some students prefer the cosiness of small departments, which are likely to score poorly on this metric.

Results

The purpose of this application is to allow interactive exploration of data, and different users will draw different conclusions from this exploration. So, rather than present any results, I invite you to use the application yourself!

Limitations and conclusion

As mentioned earlier, this ranking application is meant to be a first draft rather than the final word. Given that publication in selective venues is key to success as a researcher, I believe that the app produces some useful information. However, any ranking based on objective metrics has limitations, and this system is even more limited in considering just one measure.

The app has some implementation issues as well. The roster of faculty members used here was generated through crowdsourcing and is not guaranteed to be free of bugs. Also, linking faculty members in the roster to DBLP records isn’t always easy: a professor’s DBLP entry may use a name that is slightly different from their name in the roster, and some professors appear under multiple names in DBLP. I used a heuristic to join the two data sets while correcting for common variations of names, and also manually examined the data, but this is hardly a failsafe process.

However, I am hoping that the wisdom of the crowd can be used to overcome some of these limitations. The code and the data for the app are freely available (see here for more details). I have created a public Google document to collect information about errors in the data; please leave a comment there if you find bugs. You are also welcome to extend the app with additional features and productivity metrics.

Going beyond this particular app, I think we need to start a conversation about how to help prospective students make better decisions about where to go for graduate school. For a long time, we have allowed entities who do not have a stake in our discipline to rank our graduate programs. These rankings have dubious methodology, and they also have real implications for our departments. At the same time, we must recognize that they fill a real need. Creating nuanced, data-driven approaches to school ranking is a constructive way of challenging their hegemony.

[This post has been updated since it was first published.]

Notes:

Seddighin and Hajiaghayi have recently developed a ranking of departments by publication productivity in Theoretical Computer Science. Also, Remzi Arpaci-Dusseau developed a ranking of systems researchers by productivity a couple of years ago. ↩

The post Ranking CS Departments by Publication Productivity, Interactively appeared first on The Programming Languages Enthusiast.

URL: http://www.pl-enthusiast.net/2015/04/30/ranking-cs-departments-interactively/

Bonus: 2015 in review … and a farewell - 12:00

I know this is premature, considering as we’re only halfway through the year. But May marks the 30th month of this blog, and even though I don’t have statistics, I suspect that there were an unusually high number of tip-top terminal titles in that short span.

So as a final nod to truly great software, I’m going back through anything from the first half of this year, and maybe even creeping a little into the last days of December. Let’s look one more time at the stuff that really shines.

Out of respect for the 2014 and 2013 best-of’s, I’ll lump this together in a crazy, scrambled, unjustified jumble. Hey, you get what you pay for. … ;)

One word: Geekbait.

Best geek detector: horst. The first gold-star-winner of 2015 was a geekplosion going by the name of horst. In the grand cornucopia of wireless monitors and network tools, horst is a thoroughbred with plenty of visual yummies to keep you entertained. I know next to nothing about wireless technology, except to start up a network manager in hopes of reaching my e-mail. I started up horst once — just once, mind you — and every geek within 25 meters was suddenly and inexplicably pulled directly to my laptop screen. horst may be a great way to get more information about the invisible networks that float around us all, or it may be a good way to meet the geek of your dreams … if you are not already the geek of your dreams, that is. :roll:

Which disk has Jumpman on it?

Best archiving tool for your media collection of 25 years ago: fdd. It’s probably not fair to pick at fdd for being a good decade or two beyond its intended audience, and given that it has provisions for CD and other removable media, it’s still useful in this day and age. All the same, it is obviously intended for a different generation, when keeping track of floppies was a tedious chore. With lots of options for searching and naming, fdd wins a tip of the straw skimmer in recognition of a good design regardless of its Precambrian origins. You laugh, but one day you’ll be scraping around the Internet trying to remember the name of this program, so you can use it to catalog your overloaded USB stick collection. …

Finally, an editor that doesn’t treat me like an idiot.

Best text editor that isn’t vim or emacs: wpe and we. I make a big deal out of text editors that aren’t the twin cancers of the console, vim and emacs. Picking between those two is like asking if you’d rather be hit between the eyes with a ball-peen hammer or with a cricket bat: Neither one is a reasonable choice for a rational computer-using human-type person. And the third-place finisher, nano, is so far behind that it’s really just a timid, simpering speck on the horizon. For an editor with real character, wpe and it’s non-programmer version, we, are sheer genius. Built-in file selectors, popup menus, a Van Gogh-ish palette of color, menu bars, cascading and tiling windows, syntax highlighting and sensible defaults — anything you could want in a text editor without having to suffer through the mind-numbing triple-key combos of emacs, or the asinine, feverish mode-swapping of vim. Yes, that’s right, I just called your editor mind-numbing and asinine. You got problem with that? :evil: I’m getting bold in my last days. … :shock:

Thanks for making the rest of us look bad.

Rampant overachiever of 2015: httping. I was sorely tempted to throw httping to the top of the list, as a technological salt lick for free-ranging geeks in your local environment. But httping gets its own bracket, and wins the judges’ choice for the most hopelessly overdone title of 2015 — especially since it is, at its core, a rewrite of the omnipresent ping utility. Some ping remakes are content to emit beeps, or bounce like EKGs, or just show a little more information than the classic tool. But that’s not enough for httping, which was obviously exposed to a massive overdose of gamma radiation, because it comes out of the chute like The Hulk jacked up on methamphetamines. httping gives you every possible modulation on ping you can imagine, plus some you only saw in your nightmares. Look, everybody wants to be the ping, but this is just ridiculous. :shock:

Welcome back, Kotter.

Best do-it-yourself office suite: mined, ol, alot, scim and tapecalc. I’m lumping titles here, which probably isn’t fair, but if you wanted to set up an entire suite of software to manage your home office on a leftover K6-III, you got five grand programs just in the first half of this year. tapecalc is a majestic redesign of the office adding machine archetype, suitable for tabulation, accounting and even just simple math, and won’t run up your budget by burning through that awful thermal printer paper. mined is another prime example of why both vim and emacs should be completely discarded and all their prophets exiled to a very cold place, and a magnificent example of how easy text-based editing should be.

alot won out over multimail as an easy-to-learn, easy-to-manage and easy-to-read mail agent, even if the supporting software will be a bear to set up for some people (myself included). ol is the baby of the crowd, proving its worth just days ago as a to-do list manager and hierarchical note taker, even if it doesn’t have all the flash of some others. And scim wins the seat of honor in the text-based office suite, for bringing the decades-old spreadsheet sc back to modernity, and adding so many desperately needed features that you’ll think it’s Christmas. Even if you weren’t around for all 2 1/2 years of this blog, look at those five programs and you’ll have enough to keep your K6-III busy. (Hey, don’t laugh: Those old Super Socket 7’s are regaining their value, now that they’re vintage technology. …)

How is that even possible?!

Best sights and sounds: shpaint and mpfc. It was probably easy to see coming, but shpaint was a winner in the visual arts department for making terminal emulator art as easy as point-and-click. I don’t know what sorcery makes it possible to do that from bash, of all places, but shpaint made me all misty-eyed, remembering the good old days of Paint Now! (Massive bonus points if you know that one.) Granted, it didn’t really stand up to some of the higher features of other graphics programs, but all told, it was the best artistic invention for the year.

And speaking of artistic endeavors, mpfc overshadowed any and all music players for the console in the last six months, by adopting an easy interface, great color, good sound and a lightweight footprint. I felt a little bit guilty taking mpfc for a spin, and then returning to moc: Test-driving it was like hanging out with your best friend’s SO for a day of fun, and then heading home to eat dinner with your family. :???: Perhaps in another life, we could have found happiness together. … :'(

Save this for Sept. 19.

Best tool for looking at pictures of cats: rtv. The reddit army can stop harassing me for not thinking too highly of their stomping ground, and take solace in the fact that rtv is probably one of the nicest console-based forum browsers dedicated to a single web site. Good color, an arrangement that mimics the visual style of the site, one-key commands for the most common actions, and account management for ease of startup. I’m almost sad that it really only works with one site; a tool like this that could rip through forums and BBSs would be a godsend. I may not be a fan of reddit, but using rtv could be reason enough to visit it. Now stop e-mailing me to tell me your favorite reddit hangouts. >:(

Where shall we go this time, my little robot buddy?

Best game: scrap. I pulled out a lot of hair before I made this decision. For as brief a time as we had together in 2015, a lot — and I mean a lot — of that time was spent on games. Some of them were magnificent too: Frozen Depths aligned itself with the topmost rung of roguelikes, making it a peer of things like Cataclysm DDA and Dungeon Crawl. nsuds was so well made that it could have been a tutorial on how to design a text-based console game. And if you didn’t know better, net-o-grama could trick you into thinking it was a graphical game. Even hack-of-life managed to astound me, and that game was built on the ancient carcass of a decades-old theoretical population simulation. It was a good six months for games, no matter how you looked at it.

In the end though, it was the simple novelty of scrap that won me over. I was still puttering around with scrap for days after I had written about it, in between the other games that I was skimming through. It’s not a perfect game (what is?) and it is at risk of disappearing into the technological ether of history, if archive.org ever jettisons it. But for sheer fun, a simple but challenging premise and a colorful interface, scrap won the crown. Please accept my heartiest endorsement.

And that’s it. That’s all I have to say. I’ve come to the end of all the lists and wikis, indexes and repositories, collections and top-10-lists. :(

Now I’m afraid our time together must come to a close.

I don’t have any words of wisdom beyond what I left at the top of the old blog. We had a good decade, and I hope the past 30 months of silly little text-only adventures were as interesting — and maybe educational — for you as they were for me.

Keep trying new things, keep looking for simpler solutions, and keep using the programs you like best. And remember: Be kind to one another. We’re all we’ve got!

K.Mandla :)

URL: https://inconsolation.wordpress.com/2015/04/30/bonus-2015-in-review-and-closure/

Or Fleas! - 02:09

Ads by Project Wonderful! Your ad could be here, right now.

Hisss sssss

this is snek

URL: http://questionablecontent.net/view.php?comic=2950

Wednesday, 29 April 2015

Bonus: A going-out-of-business sale - 12:00

Today and only today, a special sale on leftover software titles. :D Some of what you find here today might not work, and in some cases it didn’t work for me either.

On the other hand, some just required too much time to set up, and so might be perfectly functional — in fact, it might be just what you always wanted.

I recommend you try everything and if it fits, take it home with you. ;)

birdsay: A nifty little gimmick, this displays a tweet as an ASCII bird, which only makes sense. I expect this will require a Twitter account though, and I don’t have one, so I leave it to you to explore.
btail: A Bayesian log filter, which proved a little more homework than I could appreciate. Functional, just a little more complex than I had time for.
clerk: An mpd client, which I believe behaves a lot like dmenu. I hold no grudge against mpd, except that it usually takes a long time for me to set up, and so I tend to procrastinate with mpd-based software. If you use mpd, it might be worth trying.
clif: clif intends to make gifs from other sources, all at the CLI. My initial efforts to get it running were unsuccessful; see if you have better luck.
content-screenshot: I believe this relies on Chromium, so I’ve omitted it as a graphical program.
deco: A text-based two-pane file manager patterned after Norton Commander. This wouldn’t build for me, and I see on the home page that it has been deprecated in favor of another file manager. I’m afraid the code is just too old.
demlo: A software structure for tagging and organizing music. I’ve had this on my list for quite a while, but I ran out of steam with tagging utilities way back in December. And then I just plain ran out of time. :(
dired: I suspect there are two dired’s: an old one that runs within emacs, and a newer one that works independently. My confusion over the name and my difficulties getting the old one to work kept knocking this file manager down my priority list, and now I hand it off to you.
gsync: A tool to synchronize folders against Google Drive. I could build and run this, but it never synced against my account. I suspect Google has updated its protocol to the point where gsync can no longer communicate with the cloud, as sometimes happens.
ImageScraper: An image downloader. A submission via e-mail, this never worked for me as promised, even when I buckled and tried to install it with pip.
img: An image uploader specific to imgur.com. To be honest, imgur’s pages are always terribly heavy and laden with gimmicks that make it slow to navigate on a 12-year-old computer, so I never bothered trying it. Plus, I’ve outlived three or four third-party image hosts over the past decade, so I don’t have much need for imgur.
imgbash.sh: I believe this is imgur.com’s in-house upload script, which may or may not be better than the previous title. Again, I leave that to you to discover.
marrie: Martin sent me a link to marrie just yesterday, no doubt in response to the deluge of podcast downloaders of last week. You sent it just in under the wire, Martin, but you left me no time to try it … :???:
mine: A console interface for Dropbox, which could be quite useful if you rely on Dropbox for cloud storage. I think you can actually run software from the cloud too. :| I didn’t get the chance to try this at all, but I do have a Dropbox account … somewhere. … :???:
mu4e: Another mail agent for emacs. The next blog will be “How to take care of all your computing needs from within emacs.” Just kidding. :roll: If I remember right, there’s a standalone version (in other words, without the need for emacs) just called “mu.” And now, little man, I give the watch to you.
rhype: A music player specific to Hype Machine. I did due diligence with this, but it came up lacking. Lazy responses from the daemon, almost no control over the actual player doing the work, and very sluggish behavior when trying to access the playlist. See if you have better luck though.

And there it is: The end of the end. Everything I know and ever got, whether through my own efforts or through he kindness of contributors, is now somewhere on this site. All the 400 original tools and the 2000 or so that followed are public record. A treasure trove, for generations to come. :'(

Tomorrow, one last look at the first half of 2015.

URL: https://inconsolation.wordpress.com/2015/04/29/bonus-a-going-out-of-business-sale/

Is That A Portrait? - 02:09

Ads by Project Wonderful! Your ad could be here, right now.

butts butts butts

URL: http://questionablecontent.net/view.php?comic=2949

Tuesday, 28 April 2015

ndn: And memories of the past - 12:00

About a month ago, StreaK left a note about ndn, a/k/a Necromancer’s DOS Navigator. I might catch some flak for this, but I saved it for last, before the shop closes down.

Introductions first. ndn has a precompiled Linux version that you see there, and as you might expect from the screenshots, I find it quite enjoyable. You can see the two-panel menu-based approach that hearkens back to Midnight Commander and its predecessor, Norton Commander.

ndn takes that style and adds multiple window support, a healthy smattering of disk functions (like renaming a volume, like you see above), separate histories for command line and panel activity, and a lot of other tricks. Most everything is driven by a popup dialogue, which I always appreciate.

Even more amazing, ndn adds in a calculator, a calendar, a phone book (remember, this has history back to the modem era), and a lot more. If you try ndn, take your time with the “Utilities” menu, but don’t be shy about the others either.

Much of what ndn can do is arranged by submenu, and that makes it mostly easy to follow. There are also provisions for user-defined menus, and a special function menu just for Linux tasks.

With all that going for it, I still have a few small problems to report. First, as you can probably see if you look close at the screenshots, I’m getting some screen artifacts when I run the precompiled version in Arch. I don’t blame anyone for that; it’s a distraction, but it’s just the nature of the beast.

Second, there are some obvious leftover “tools” and “menus” from the DOS era. The letter strip you see along the bottom of each panel clearly corresponds to drive labels under Windows, and double-clicking on one in Linux just drops you into /dev/, which doesn’t really help.

It would be nice if that area instead showed mountpoints or even polled devices like sda1, sdb1, sr0 and so forth. As it is, that strip is wasted space. Add to wasted space: Displaying file names in the same color as the background for certain file types. That’s an unfortunate oversight. :\

ndn is also a very powerful suite of tools, and will take a lot of time to learn all the commands. But to complicate things, it appears there are different controls when running under X and when in a virtual console. Keyboard navigation seems disabled under X, meaning you have to do all your selections and menu action with the mouse. On the other hand, gpm was no help in a console.

Those are the main reasons I was hesitant to include ndn in this list: It seems a lot of its functions will hinge on graphical support. Which makes it not much better than undistract-me, which we talked about a few days ago.

And why save it for last, you ask? Nostalgia, mostly. If you remember the text-only era and the parade of file managers and disk checkers that all ran within consoles and had no graphical support, ndn is like a trip down memory lane. Popup menus drawn and shaded in hashed boxes, menu bars triggered with ALT+letters, and F-row hotkeys will take you back to the software of 30 years ago … all the way down to the color scheme. It’s a good feeling.

So if you take issue with a leftover DOS file manager, overstocked with utilities and rather haphazardly converted to Linux, appearing here at all, that’s my excuse: It reminds me of other times. :)

Tagged: file, manager

URL: https://inconsolation.wordpress.com/2015/04/28/ndn-and-memories-of-the-past/

Psychic Readings - 02:22

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=2948

Monday, 27 April 2015

ol: One last editor and outliner - 12:00

A long time ago, I used to keep notes and lists with a normal, everyday text editor, and just draw up a outline format if I needed to show some sort of structure or to-do checkboxes.

Sometime around 2008 or 2009, I found two new applications that quickly took over those roles — one was vimwiki, which I may only need for another day or two, and hnb. hnb was on my system, in spite of its age, for a long five years until I found tudu last summer.

hnb and tudu (and the emacs and vim plugins that do much the same thing) are not the only hierarchical note-takers available. You can add ol to that list, with my endorsement … whatever that’s worth. :roll:

ol stands for “Outliner Lighto,” if the home page is to be believed. And what you see in that image is probably the best snapshot of what it does and how it works.

Arrow keys will navigate through a tree, and leaf nodes expand when you navigate through them. Press “d” to delete a note and all its children, Enter to edit a note, “t” to convert it to a checkbox for to-do lists, “x” to mark a task as done, and so forth. The empty file startup screen will give you help and instructions, if you need it.

Probably one of my favorite things about ol is the cut-and-paste action, or better called the “grab” function. Press “g” and you carry a note with you through the tree, allowing you to arrange and rearrange to your heart’s desire.

Since the display effectively updates as you navigate, it’s a lot easier to organize and visualize than the traditional cut-and-paste model. I like that a lot more than tudu’s way, which borrows yank-and-paste style of vim. And you know how I feel about vim. :evil:

ol is also colorful, even going so far as to assign colors to note depths, which is another wise evolution. hnb and tudu haven’t picked up that idea yet, and it’s one that is probably worth adopting.

I see that ol is written in pascal, which strikes me as unusual, but also completely irrelevant to using the program. As an added bonus, if you’re an hnb user and decide to use this moment to emigrate, there’s a utility that will convert hnb’s text format to a style ol can use.

ol doesn’t have some of the more detailed functions of tudu — like displaying a percentage complete for to-do lists, or allowing extended notes, deadlines and schedule dates. ol probably won’t dethrone tudu for me, for those reasons. I find with tudu that I can rely less on wyrd now too.

That alone is no reason to deny ol the gold star it deserves — for a clean interface, plenty of startup help, easy controls and a few innovative ideas for hierarchical list tools. Don’t spend it all in one place: :star: Enjoy! :)

Tagged: manager, note, organizer, task, to-do

URL: https://inconsolation.wordpress.com/2015/04/27/ol-one-last-editor-and-outliner/

Let's Recap - 02:37

Ads by Project Wonderful! Your ad could be here, right now.

Well well, look who it is

URL: http://questionablecontent.net/view.php?comic=2947

Sunday, 26 April 2015

since: Since you last checked - 14:00

I promised I would get since onto these pages before the end comes, mostly because I don’t remember any other log viewer that has this behavior by default … and I want to be able to remember it in the future.

It’s hard for me to be sure though, after so many years and so many log utilities. :\

since seems different because, as you might have inferred from the screenshot, it only displays log data since the last time it checked. So you can see the last portions of pacman.log at the top of that image, then the repository update. The next invocation of since only shows the two lines that had been added.

I’m sure other tail-esque tools can do this, and possibly add a few nifty tricks in passing. It’s just a matter of finding the right flags and getting them in order.

For its own part, since keeps its state file in ~/.since, and you have the option to ignore it. You can also tell since to use a special state file, to run periodically, to ignore compressed logs, ignore missing logs, and a lot of other options.

I am not a real bloodhound when it comes to keeping an eye on logs, so at its best, since is useful … but only rarely. On my pseudo-desktop system, there’s almost no call for it.

On a more complex system or in a situation where log files are critical, it might save you some time trying to get the information you need. I’m willing to give it a thumbs-up. :)

Tagged: information, log, system, tail, viewer

URL: https://inconsolation.wordpress.com/2015/04/26/since-since-you-last-checked/

Saturday, 25 April 2015

rfc and httpdoc: Two terminal references - 15:15

I have a couple of simple but related tools today, both from the same author. At left is rfc, and at right is httpdoc.

I’ve known about rfc for a while, but got a reminder about httpdoc earlier this week via e-mail. Since they both have the same style and same creator, it makes sense to lump them together.

rfc, when supplied with a number or a topic line, will pull the text of that RFC from the web and dump it into your $PAGER. No fancy formatting, no color-coded document histories, just one-shot quick access to RFCs all the way back to … well, back to number 1.

The home page has a three-step process for “installing” rfc into your $HOME directory, although I daresay it could be rearranged to allow for more than just one person to use. In any case, it takes very little effort and rfc itself won’t bog down your system, seeing as it’s just a bash creation.

As an added bonus, rfc will keep its documents stored locally, so you don’t have to re-download a request. If you rely on rfc frequently, you’ll probably be interested in some of the built-in actions — like update or list, which give rfc a little more oomph, and search, which … well, you should be able to figure that one out. :roll:

httpdoc is similar, in a way. As you can see above, httpdoc becomes an offline reference tool for HTTP documentation. In the screenshot above, I only showed the 404 status code, but httpdoc can also return documentation on header fields, if you need that.

I can see where httpdoc is still being updated even in the past few days, so I expect there will be more references to come.

httpdoc is written in go, so you’ll need that installed before it will play along. There are also some environment variables that you’ll want to adjust before using it, but it’s nothing complicated.

Both of these tools might strike you as too simple to be noteworthy, but that will depend a lot on your perspective. I use things like dict on a daily basis, and even have it hot-wired for thesaurus entries as part of my .bashrc.

If you have a similar need for RFC or HTTP documentation at the command line, then you might find both of these install-worthy. Necessity is the mother of invention. Or is it the other way around … ? ;)

Tagged: documentation, html, http, reference, references

URL: https://inconsolation.wordpress.com/2015/04/25/rfc-and-httpdoc-two-terminal-references/

Friday, 24 April 2015

undistract-me: Quashing your ADHD - 21:00

I fear this little utility might be usable only for a discrete set of fans. Technically speaking it’s a text-based application, but … well, I’ll let you take a look and see what you think.

In principle, it’s rather simple: undistract-me simply takes note if a shell command takes longer than 10 seconds to execute. If it does, it waits until the program finishes, then throws up the alert message you see in the screenshot above. Kind of cool, in an odd way.

Strictly speaking though, you’ll need all the underpinnings of a graphical desktop, plus whatever alert system is in use there, before you’ll get close to that kind of behavior. On my semi-graphical Arch system with just Openbox, I ended up adding gtk3, polkit, dconf, json-glib and a mess of themes and libraries before the git version was close to running.

So I don’t know if I’m being fair by including it. Don’t expect to suddenly plop this into place on your 400Mhz Celeron running screen, because you’re going to need a lot more to get close.

I won’t deny that I like the idea though, and if something comparable could be implemented in a text only environment, it might be worth trying. For my own part, I used to append long-running commands with aplay yoo-hoo.ogg so I would get an audible when something finished.

So in that way, I can sympathize. But unless you use a lot of terminal commands on a Linux Mint desktop and need some sort of blinky reminder when one finishes … well, like I said, it will probably only appeal to a slim range of fans. :\

Tagged: command, desktop, finish, notify

URL: https://inconsolation.wordpress.com/2015/04/24/undistract-me-quashing-your-adhd/

Grow Your Own - 02:38

Ads by Project Wonderful! Your ad could be here, right now.

Amelia is pretty chill

URL: http://questionablecontent.net/view.php?comic=2946

Thursday, 23 April 2015

rmlint: The potential to purge - 21:32

I am behind the power curve today, because of some real-life obligations. I am going to grab something quick and easy so as not to fall behind; things are going to be even busier into the weekend.

This is a snapshot of rmlint in action:

rmlint cruises through your directory tree, and offers to remove files it thinks are duplicates, zero-byte or hollow folders. And as you can see, given enough space, it will take its time to come up with a menu of potential excisions.

I did say “offers,” which is an important distinction. rmlint’s output is a little unexpected: By default it generates a script that will remove suspicious targets, but doesn’t outright eliminate them.

Which is a good thing; it means you have the option to adjust its list, and it also means you take the next step — running the script — with conscious aforethought. You can’t blame rmlint for whacking apart your /usr/lib folder if you told it specifically to do it.

I like rmlint for a number of reasons — it’s rare that I see a system cleaner, it takes the safe route by creating the script, and it has a smidgen of color.

But that’s about all the time I have today; I’ll get deeper into something tomorrow … I promise. ;)

Tagged: cleaner, directory, duplicate, file, folder, system

URL: https://inconsolation.wordpress.com/2015/04/23/rmlint-the-potential-to-purge/

Inequality - 02:03

Ads by Project Wonderful! Your ad could be here, right now.

Emily is the Burger Queen

URL: http://questionablecontent.net/view.php?comic=2945

Wednesday, 22 April 2015

x_x: The Dead Guy CLI - 12:00

With barely a week left for this site, I’m beginning to trim away programs that I just probably won’t get to, by virtue of time or technical dilemmas. I’m also making a conscious effort to pick out titles that amuse me in one form or another, so I finish with happy memories. :P

x_x, which I mentally refer to as “the Dead Guy CLI,” because the home page uses that as a subtitle, is a rather nifty tool that I’m surprised I haven’t seen covered elsewhere. Using a bland, dull, boring Excel spreadsheet borrowed from a corner of the Interweb, Dead Guy CLI transmogrifies it into this:

Well isn’t that clever.

Dead Guy CLI gives you a small measure of control over your output, by allowing you to specify a header row or allow for special encoding. It also works with CSV files, so you’re not strapped trying to convert back and forth to Excel, just to fiddle with x_x.

Aside from that though, Dead Guy CLI seems very simple. Of course, your spreadsheet may need some management if you expect it to fit into a certain dimension, but I am confident that as a skilled and capable member of the information age, you won’t throw a wobbly over a pear-shaped spreadsheet.

Keep x_x in mind when you’re thinking about things like csv2xls or xlhtml, since it may save you a step or prevent you from relying on graphical tools just to extract data from a spreadsheet. And of course, if you’re working with csv files, x_x could supplement what tabview or other tools can do.

For my own recordkeeping, Dead Guy CLI gets points for doing something obvious that I don’t recall seeing elsewhere. And also for the snarky name. I’m a fan of snarky names. :twisted:

Tagged: ascii, change, chart, convert, csv, data, excel, file, spreadsheet, xls

URL: https://inconsolation.wordpress.com/2015/04/22/x_x-the-dead-guy-cli/

Pending Request - 02:27

Ads by Project Wonderful! Your ad could be here, right now.

GOODE JOEB

URL: http://questionablecontent.net/view.php?comic=2944

Tuesday, 21 April 2015

greg: You’re so vain - 13:15

Everyone named “Greg” out there in the world can now sit up straight and imagine this little program is named in their honor.

I was introduced to greg after yesterday’s note about podcastxdl, and in spite of its lack of color and command-action-target input style, I think I like it better than the latter.

Of course, that screenshot isn’t very interesting, but what you see there is a lot of the way greg works. It maintains a list of podcasts and addresses, and you can wrangle them with fairly straightforward actions.

greg add adds to that list. greg remove drops it off, after you confirm it. greg check sees if anything is updated, and greg sync synchronizes your local folder with what’s available online. Like I said, it’s fairly straightforward.

I don’t see anything offhand that disappoints me about greg. I ran into no errors except when I fed it an invalid link, and it warned me that it wasn’t going to work. And aside from the lack of color and lack of an “interface,” it seems to work perfectly without my empty-headed suggestions.

So there’s greg, which we can add to the meager list of podcast aggregators for the console. Now do you see it? “greg”? “aggregator”? Aha. … ;)

Tagged: audio, download, manager, player, podcast

URL: https://inconsolation.wordpress.com/2015/04/21/greg-youre-so-vain/

When In Doubt, Science - 04:36

Ads by Project Wonderful! Your ad could be here, right now.

For the record, I disagree with all three of them- day-old fried chicken is the superior leftover food.

URL: http://questionablecontent.net/view.php?comic=2943

Friday, 17 April 2015

Work It Work It - 03:38

Ads by Project Wonderful! Your ad could be here, right now.

Claire you dork

I am in Calgary for Calgary Expo! Come by table 722 and I will be happy to draw for you and sell you stuff!

URL: http://questionablecontent.net/view.php?comic=2941

Thursday, 16 April 2015

Cool Story, Bro - 04:04

Ads by Project Wonderful! Your ad could be here, right now.

I am in Calgary for Calgary Expo! Come say hi!

URL: http://questionablecontent.net/view.php?comic=2940

Wednesday, 15 April 2015

Spontaneous Indeed - 02:05

Ads by Project Wonderful! Your ad could be here, right now.

I have been watching a lot of Jojo's Bizarre Adventure lately

URL: http://questionablecontent.net/view.php?comic=2939

Tuesday, 14 April 2015

Dynamic Software Updating: Linux 4.0 and Beyond - 18:00

Last month, Steven J. Vaughan-Nichols of ZDNet alerted us that Linux 4.0 will provide support for “no-reboot patching.” The gist: When a security patch or other critical OS update comes out, you can apply it without rebooting.

While rebootless patching is convenient for everyone, it’s a game changer for some applications. For example, web and cloud hosting services normally require customers to experience some downtime while the OS infrastructure is upgraded; with rebootless patching, upgrades happen seamlessly. Or, imagine upgrades to systems hosting in-memory databases: Right now, you have to checkpoint the DB to stable storage, stop the system, upgrade it, restart it, read the data from stable storage, and restart service. Just the checkpointing and re-reading from disk could take tens of minutes. With rebootless patching, this disruption is avoided; cf. Facebook’s usage of a modified memcached that supports preserving state across updates.

I’m particularly excited by this announcement because I’ve been working on the general problem of updating running software, which I call dynamic software updating (DSU), for nearly 15 years. In this post, co-authored with my PhD student Luís Pina, I take a closer look at the challenge that DSU presents, showing that what Linux will support is still quite far from what we might hope for, but that ideas from the research community promise to get us closer to the ideal, both for operating systems and hopefully for many other applications as well.

Dynamic software updating: What is it?

We have all experienced normal, or static, software updates: Download the new code, stop the affected program if it’s running, apply the patch, and restart the program. What makes dynamic software updates different is that they avoid the stop-and-restart part by also updating the running program’s execution state. This state consists of data, like linked tree and list structures that store our in-memory database, and control, like the execution stacks of active threads. The program code assumes the state adheres to a certain format and invariants, and therefore changing the code at run-time requires changing the execution state appropriately.

Dynamic Software Updating

Consider an example. If in the updated program the entries of a hash table are extended with a timeout field, then a dynamic update needs to convert in-memory hashtable entries to now contain a timeout field; otherwise, when the updated code goes to access that field, it will behave unpredictably. Or, if the new code uses two threads to perform some functionality which in the old version requires only one thread, then we need to map the existing thread’s stack to an equivalent one for the new code, and start a new thread to handle the extracted functionality. Changes to in-memory data, like the first example, we call data migrations, while changes to control, like the second example, we call control migrations.

Rebootless patching in Linux 4.0

The rebootless patching support in Linux 4.0 is the descendant of two existing proposals, kpatch (from RedHat) and kGraft (from SUSE). ¹ These two descend from earlier research, by Jeff Arnold and Frans Kaashoek, on a solution called Ksplice, which was bought by Oracle in 2011.

These solutions focus on dynamic changes to code. The basic approach to dynamically updating a function f is to overwrite the first few instructions of the current f to jump to its new version. This approach has the benefit that any references to f elsewhere in the code or data (i.e., as function pointers) will still work.

Activeness checking in kpatch

Function f cannot be running when this change takes place, for obvious reasons. Kpatch additionally requires that changed functions are not active, meaning they are not referenced by any process’s call stack at the time of an update (which it checks after pausing all processes). Why is this useful? The assumption is that the control state of the old and new kernel will be the same when no changed functions are active. As such, delaying the update until this condition is satisfied means the control state of the running kernel is valid for the new code. This makes it easier on the programmer.

Unfortunately, this assumption fails to account for the effects of prior execution of changed functions on the program’s data, even if the role of that data is the same between versions. Consider the following example (distilled from our prior work on dynamically updating OpenSSH daemons):

Old version:

void go() {
  setup(); /* update here */
  handle();
}
void setup() {
}
void handle() {
  global_ptr = init;
  x = (*global_ptr).field;
}

New version:

void setup() {
  global_ptr = init;
}
void handle() {
  x = (*global_ptr).field;
}

Consider that a process is executing function go, which was not patched, right after function setup returned and before calling function handle. This means that the old version of function setup ran, which did not set the variable global_ptr. But after the update, the new version of function handle will execute, which will dereference the global variable and crash. A research study we did found that kpatch-style activeness checking is not sufficient to ensure that the control and data state is correct, and moreover can be quite restrictive.

Multi-version execution in kGraft

KGraft tries to address this problem by ensuring version consistency. That is, when performing a system call, the kernel either executes old code or new code, but not a mix of both. KGraft enforces version consistency on a per-process basis, so it is possible for one process to execute the new code while another process executes the old code. When a process makes a system call after the patch is installed, kGraft sets a “new universe” flag on that process. From that point on, that process will always use the patched code. KGraft uses an extra level of indirection called a “reality-check” to decide, at the entry of patched functions, which code to execute based on the process flag. Once the flag is set on all processes, kGraft drops the now-redundant indirection and jumps straight to the patched code.

The problem with multi-version execution is that processes running two different code versions could interact, e.g., through common data structures, and thereby potentially violate new (or outdated) invariants. This problem would be particularly acute if the old and new version changed a data structure’s format. ²

(Lack of) data updates

None of these kernel live-patching mechanisms support updating data, at least not fully or easily. Ksplice proposed updating data using shadow data-structures (based on DynAMOS by Makris and Ryu). Given that the original data-structure has no space for new fields, the idea is to create a separate data-structure just for them and rewrite the original binary code to use the new structure when manipulating the new fields. This approach, however, introduces a lot of complexity (and opportunities for bugs) as the code is maintained and patched further, going ahead.

Full-featured DSU

Linux 4.0 DSU support is a far cry from supporting Vaughan-Nichols’ hope that “With Linux 4.0, you may never need to reboot your operating system again:” it is simply not flexible enough. The goal of DSU is to avoid stops-and-restarts; thus, ideally, any updates to a program we can make statically we can also make dynamically. I.e., we should be able to add new functions, change function types (e.g., to have new or different arguments), or modify data structures, e.g., by adding new data elements, breaking apart or changing the types of existing elements, adding and removing pointers, etc. But Linux 4.0’s rebootless patching support limits flexibility for the sake of better performance, backward-compatibility, and ease of use by the kernel programmer, even though the latter is not quite satisfying, as we have discussed: ironically, a “rebootless patch” could result in a crash, defeating the point!

The research community has been looking at how to support highly-flexible DSU for many years now. ³ We view whole-process DSU, pioneered by Makris and Bazzi’s UpStare system, as the most promising approach for user-space programs, owing to its flexibility. In this approach, the entirety of the new code is loaded into the memory of the running process and then control and data migrations directly update the execution state prior to, or even in conjunction with, subsequent execution that code.

Kitsune and Rubah

We have developed two systems that take the whole-process approach: Kitsune, for C programs, and Rubah, for Java programs. Both systems are flexible enough to dynamically apply years’ worth of release-level updates, and they have been used to dynamically update substantial applications such as redis, memcached, snort, H2, and Voldemort. Besides their flexibility, these systems do not impose any measurable overhead on normal execution. This is because the whole-process approach allows the code to be fully optimized internally — such optimizations would be inhibited by added levels of indirection (like trampolines) and spatial non-locality common in per-function updating approaches. ⁴

The downside of both approaches is additional programmer work in supporting both control and data migration.

First, they require the programmer to add update points to the program. These are points at which the program polls to see whether a dynamic update is available, and if so will start applying it. Programs typically have only a handful of update points, and they are naturally placed at the start of long-running loops, when invariants are established and/or events have been fully handled.

Second, the programmer must define state transformation functions which indicate how data from an old version of a type/class should be used to initialize the updated version. For example, the state transformer might provide the default value for a new field. Fortunately, the process of finding and updating updated data values is automated. Kitsune updates all data at once, using a garbage collector-style mechanism. Rubah can do likewise, but also supports on-the-fly data migration, updating each outdated object when the new code first accesses it after the update. Both systems provide simple automation for writing state transformers, too, but sometimes the programmer needs to get involved.

Third, the programmer must modify the program to support control-flow migration between versions. Kitsune and Rubah start running the new program from the equivalent threads’ entry points (after data is migrated, or initiated in the on-the-flly case). To avoid re-running initialization code, the program can skip it, conditioned on being in updating mode; this mode is disabled once the equivalent update point is reached in the new program. We find that making control migration changes to the program is relatively simple because update points tend to be shallow in the control flow graph, and initialization code is often skipped en masse. For example, redis required only 2 extra LOC, and memcached required 9 LOC. Even better, this code rarely changes, so it is basically a one-time effort, rather than a per-update effort.

In the end, this work amounts to only hundreds of lines of code (compared to tens or hundreds of thousands of lines in an application), and much of the work is done once. And the benefit is full-featured dynamic updates with excellent performance.

What’s next?

Is it possible to apply the techniques from whole-process DSU to the OS kernel? One possibility is to apply “whole-virtual machine” updates—virtual machines are to virtual machine monitors (VMMs) what processes are to operating systems. Indeed, a recent criticism of the Linux 4.0 DSU support points to this direction: “Rather than trying to patch a running kernel …, why not just save the entire state of the system, boot into an entirely new kernel, then restore the previous state on top of the new kernel?”

The idea of non-stop operation is appealing outside of standalone user programs and OS kernels. What about updating elements of a distributed system in a way that the communication protocol is changed? Or, what about updating the schema and contents of a database (so-called “schema migration”) while migrating the applications that use it (e.g., and avoid the 22 hour downtime experienced by Mediawiki)? We can also imagine updating cloud-based web applications—the updates can be pushed silently to the users even while the applications are in use. All of these changes would facilitate more ready application of functionality or performance improvements, and of security fixes. Once you realize that dynamic updates are not that different from static ones—they just require a little more work to update the execution state—you realize that the benefit of non-stop operation is very much within reach.

Notes:

The new live kernel patching support that Linux 4.0 introduces is a common core from kpatch and kGraft. The new API allows modules that contain patches to be loaded, listed, and removed. It also performs the low-level redirection to replace patched functions. It still remains to bring in some kind of safety checking, as kpatch and kGraft differ on this. ↩
The multi-version execution approach was considered, for process-level dynamic updates, by the POLUS dynamic updating system. ↩
Good DSU surveys include Segal and Frieder’s 1993 paper, Ajmani’s 2002 survey, and the related work of the Kitsune journal paper. ↩
Note that Upstare imposes substantial execution overhead because it compiles the program specially to perform stack re-winding, for control migration; Kitsune and Rubah favor programmer-provided support, which turns out to be much more efficient. ↩

The post Dynamic Software Updating: Linux 4.0 and Beyond appeared first on The Programming Languages Enthusiast.

URL: http://www.pl-enthusiast.net/2015/04/14/dynamic-software-updating/

Olfactory - 02:04

Ads by Project Wonderful! Your ad could be here, right now.

Tycho Brahe would be jealous

URL: http://questionablecontent.net/view.php?comic=2938

Monday, 13 April 2015

Clinton's Bizarre Adventure - 02:32

Ads by Project Wonderful! Your ad could be here, right now.

It's Elliott! We haven't seen Elliott in a while! Hi Elliott.

URL: http://questionablecontent.net/view.php?comic=2937

Sunday, 12 April 2015

Spiders vs. the Sun - 00:00

Spiders vs. the Sun

Which has a greater gravitational pull on me: the Sun, or spiders? Granted, the Sun is much bigger, but it is also much further away, and as I learned in high school physics, the gravitational force is proportional to the square of the distance.

—Marina Fleming

Note: This is a spider-heavy article. I can be a little anxious about spiders myself, so my research for this article involved a lot of opening PDFs while squinting and leaning back from the screen. If you're a serious arachnophobe, you might want to skip this one.

In the literal sense, this question is totally reasonable, although it would be easy to rephrase it to be completely incoherent.

Is the Sun heavier than spiders in terms of closeness?

The gravitational pull from a single spider, no matter how heavy, will never beat out the Sun. The goliath bird-eating spider[1]Wikipedia helpfully notes that despite its name, it "only rarely preys on birds, with the exception of young birds." weighs as much as a large apple.[2]This is correct whether I mean the fruit or the iPhone 6+; the spider weighs about as much as each. Even if, God forbid, you were as close as possible to one of them, the pull from the Sun would still be 50 million times stronger.

The strength of the force seems to vary inversely with distance, although with some kind of lag; once I start running, I don't want to stop for a little while.

What about all the spiders in the world?

There's a well-known factoid that claims you're always within a few feet of a spider. Is this true? Arachnologist[3]Spiders, not ancient pottery.[4]Unless the ancient pottery is full of spiders. Chris Buddle wrote a good article on this question; as you might expect, it's not literally true. Spiders don't live in the water,[5]With the exception of Argyroneta aquatica so you can get away from them by swimming, and there aren't as many spiders in buildings as in fields and forests. But if you're anywhere near the outdoors, even in the Arctic tundra, there are probably spiders within a few feet of you.

Regardless of whether the factoid is precisely true or not, there are an awful lot of spiders out there. Exactly how many is hard to say, but we can do some rough estimation. A 2009 study actually measured the total mass of spiders in sample areas in Brazil. They found one-digit numbers of milligrams of spider[6]SI unit: the "AAAAAAAAAAAAAAAAAAAAAAAAAAAA", abbreviated "AAAAAA". per square meter of forest floor.[7]That's dry mass; you have to divide by a number around 0.3 to get the live weight. If we guess that about 10% of the world's land area hosts this density of spiders, and there are none anywhere else, we come up with 200 million kilograms worldwide.[8]One survey of fields and pastures in New Zealand and England tended to find two-digit numbers of spiders per square meter. If they each weigh about a milligram, and we assume once again that about 10% of Earth's land supports that density of spiders, that gives a total spider biomass of 100 million to a billion kilograms. That agrees with our first estimate, at least.

Even if our numbers are off wildly, it's enough to answer Marina's question. If we assume the spiders are distributed evenly across the surface of the Earth, we can use Newton's shell theorem to determine their collective gravitational pull on objects outside the Earth. If you do that math, you find that the Sun's pull is stronger by 13 orders of magnitude.

Now, this calculation makes some assumptions that aren't true. Spider distributions are discrete, not continuous,[9]Spiders are quantized. and some areas have more spiders than others. What if there happen to be a lot of spiders near you?

Although three spider feet is only 9 inches.

In 2009, the Back River Wastewater Treatment Plant found themselves dealing with what they called an "extreme spider situation." An estimated 80 million orb-weaving spiders had colonized the plant, covering every surface with heavy sheets of web.[10]Which was in turn covered in heavy sheets of spider. The whole thing is detailed in a fascinating and horrifying article published by the Entomological Society of America.[12]The conclusion of the article contains this breathtaking piece of prose:

Our recommendations for amelioration included the following general points:
1) On-site personnel should be reassured that the spiders are harmless and the facility's immense shroud of silk should be presented in a positive light as a record-breaking natural history wonder.

What was the total force of gravity from all those spiders? First we need their mass; according to a paper titled Sexual Cannibalism in Orb-Weaving Spiders: An Economic Model, it's about 20 grams for males and several times that for females.[11]Not to be confused with Trade-off between pre- and postcopulatory sexual cannibalism in a wolf spider, which is a different but equally real paper. So even if you were standing next to the Black River Wastewater Treatment Plant in 2009, the pull of all the spiders inside would still be only 1/50,000,000th that of the Sun.

No matter which way you look at it, the bottom line is that we live our lives surrounded by tiny spiders on a world completely dominated by a gigantic star.

Hey, at least it's not the other way around.

Do you think there's life around other spiders?

URL: http://what-if.xkcd.com/136/

Friday, 10 April 2015

No More Fucks To Give - 02:35

Ads by Project Wonderful! Your ad could be here, right now.

This is going downhill in a hurry

URL: http://questionablecontent.net/view.php?comic=2936

Thursday, 09 April 2015

Self-Reflection - 03:23

Ads by Project Wonderful! Your ad could be here, right now.

Oh, Clinton. Poor, stupid Clinton.

URL: http://questionablecontent.net/view.php?comic=2935

Tuesday, 07 April 2015

Game Over Man, Game Over - 02:30

Ads by Project Wonderful! Your ad could be here, right now.

And he was doing so well.

URL: http://questionablecontent.net/view.php?comic=2933

Monday, 06 April 2015

Several Hours Earlier... - 02:25

Ads by Project Wonderful! Your ad could be here, right now.

Clinton has a roommate! I guess it is sort of an odd couple type deal.

URL: http://questionablecontent.net/view.php?comic=2932

Friday, 03 April 2015

How Are You I Am Fine - 02:13

Ads by Project Wonderful! Your ad could be here, right now.

DAMN, CLINTON

URL: http://questionablecontent.net/view.php?comic=2931

Thursday, 02 April 2015

She Can't Help Herself - 02:42

Ads by Project Wonderful! Your ad could be here, right now.

Gotta get in those puns

URL: http://questionablecontent.net/view.php?comic=2930

Wednesday, 01 April 2015

Every Day Carry - 04:24

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=2929

Tuesday, 31 March 2015

Penetrating Questions - 04:33

Ads by Project Wonderful! Your ad could be here, right now.

Claire is being kind of a jerk!

URL: http://questionablecontent.net/view.php?comic=2928

Friday, 27 March 2015

Back To The Well - 02:48

Ads by Project Wonderful! Your ad could be here, right now.

This is starting to become a thing.

Emerald City Comic Con! Seattle! This weekend! I will be at the Topatoco booth on the sky bridge! Come say hi and buy some fun things!

URL: http://questionablecontent.net/view.php?comic=2926

Thursday, 26 March 2015

Recovery Period - 04:18

Ads by Project Wonderful! Your ad could be here, right now.

HANNERS NO

I am in Seattle for ECCC this weekend! Come find me at the topatoco table on the sky bridge!

URL: http://questionablecontent.net/view.php?comic=2925

Wednesday, 25 March 2015

Interview with Go’s Russ Cox and Sameer Ajmani - 17:32

In this post I interview Russ Cox and Sameer Ajmani, who work at Google on the Go programming language. They share with me their path to working on the language, what they find unique and valuable about it, and plans for it going ahead.

This continues our series on PhDs in industry working on programming languages (Avik Chaudhuri was the first). Thanks to Russ and Sameer for taking the time to share their experiences!

What is your academic background?

Russ: I completed bachelors and masters degrees in computer science at Harvard, and then I completed a PhD at MIT, working in the Parallel and Distributed Operating Systems group led by Frans Kaashoek and Robert Morris. When I arrived at MIT, the distributed systems community had started focusing on peer-to-peer networks, and so I spent a few years on related topics. For my PhD thesis I looked into the problem of how to design an extension mechanism for compilers so that extension plugins could reasonably coexist and be composed.

Sameer: I completed by bachelors in Computer Science at Cornell during 1994-1998, and completed a Masters & PhD at MIT between 1998-2004. My advisor at MIT was Barbara Liskov. My Masters research considered A Trusted Execution Platform for Multiparty Computation, and for my PhD I worked on Automatic Software Upgrades for Distributed Systems. I joined Google in 2004, and the Go team in 2012.

Russ: I grew up near Bell Labs, and during high school and college I had the good fortune to be able to spend time there hanging out in the computer science department, the birthplace of Unix and C but also lex, yacc, and the Dragon Book. At some level, knowing how to put together a language (little or big) and compiler was just part of the culture. That almost certainly led to my interest in making nicer programming environments for myself.

What is the origin story for Go, and your involvement in its development?

Russ: Robert Griesemer, Rob Pike, and Ken Thompson started designing Go at Google in late 2007, to address the software development problems Google was facing.

The first problem is that networked systems are getting larger, with more interaction between more pieces, and existing languages were not making those interactions as smooth to express as they could be. The solution was to adopt the model of Hoare’s Communicating Sequential Processes (CSP). That may sound risky, but Rob and Ken had experience with a sequence of languages done at Bell Labs that used CSP to good effect.

The second problem is that programs are getting larger and development more distributed. A language that works very well for a small group may not work as well for a large company with thousands of engineers working on many millions of lines of code. When you’ve got that many engineers working together, you want to make sure the language doesn’t have dusty corners that only a few people know how to use well. A code base that size is so large that even maintenance requires mechanical help, and existing languages weren’t designed with that in mind. Even mundane issues like how many files must be read in order to build a program matter at that scale. All of those considerations, and more, led to the idea of hosting CSP in a new language instead of trying to bolt it onto an existing one.

I was finishing my PhD in spring 2008 and visited Google. I had worked with Rob at Bell Labs, and both he and Ken told me about Go, and I was hooked. When I joined the team in August, the language was still just a prototype, with almost no library. I took over the compiler and runtime, and I got to help to develop the standard library and all the revisions and refinements to the language prompted by that experience. Today, Rob and I lead the overall Go project at Google together.

Sameer: In early 2011, I attended a workshop given by Rob Pike about Go at Google NYC and been particularly impressed with the language’s concurrency features. I was working on the indexing pipeline for Google Maps at the time and writing code that dealt with replicated storage systems. I had a library in C++ to issue storage operations to F+1 replicas and fail over to additional replicas as operations failed or timed out; it was about 700 lines. I converted this code to Go and it was under 100 lines and far more comprehensible. I was sold. I started spending 20% of my time improving the Go’s libraries within Google. The team noticed, and in Fall of 2011 invited me to join Go full time. I joined the team in January 2012. At that point, the language was almost finalized; Go 1.0 was released March 28, 2012. My role was to make Go useful for building production systems inside Google.

In your view, what are some key things that distinguish Go from prior languages?

Russ: The most obvious thing that distinguishes Go is the focus on CSP, with lightweight threads of control (we call them goroutines) communicating by passing messages on channels. That was not a common model at all when Go launched. Erlang is the closest, but even in Erlang there are no explicit channels (it is more like the original CSP paper than Hoare’s followup work).

Sameer: Most mainstream languages provide concurrency support using libraries. Having it built into the language makes concurrent code easier to read and write (for example, see the use of closures in the Digesting a tree example), and the compiler and runtime (scheduler, GC) can make concurrent programs execute well.

Relatedly: What kind of application is a great fit for Go, but may not be for other languages?

Sameer: Go is a great fit for concurrent programs, especially servers. Request handlers in servers are mostly independent and so are naturally expressed as separate threads of execution. Go provides goroutines, which are extremely lightweight threads (4KB stacks that grow as needed). It’s common to have programs with hundreds of thousands of goroutines. Go provides channels to allow goroutines to communicate (synchronize and exchange data values) and a select statement to allow goroutines to wait on multiple communication events.

Russ: I agree with Sameer: Because of the strong support for concurrency, I think Go is a great fit for network clients or servers that are dealing with many different input sources or other events happening all at once. And because of the focus on software engineering scale, I also think Go is a great fit for any program that’s going to be worked on by more than a handful of engineers or grow to more than a few thousand lines of code. Now that the Go compiler has been moved from C to Go, I can finally say that I use Go for essentially all my day-to-day programming. It’s wonderful.

How has the research literature in programming languages influenced Go’s design (both positively and negatively), if at all?

Russ: Go’s design was influenced primarily by practical experience, particularly that of the original designers. Robert had worked on implementations of Modula-3, Strongtalk, and JavaScript before Go, and Rob and Ken had worked on Unix and C and the CSP languages I mentioned earlier. Those are the strongest direct influences, but there were certainly others. To take a trivial example, the way Go does semicolon insertion was actually taken from BCPL, with JavaScript serving as a warning of what not to do.

Go is more an engineering project than a pure research project. Like most engineering, it is fundamentally conservative, using ideas that are proven and well understood and will work well together. The research literature’s influence comes mainly through experience with its application in earlier languages. For example, the experience with CSP applied in a handful of earlier languages—Promela, Squeak, Newsqueak, Alef, Limbo, even Concurrent ML—was just as crucial as Hoare’s original paper to bringing that idea to practice in Go. Programming language researchers are sometimes disappointed that Go hasn’t picked up more of the recent ideas from the literature, but those ideas simply haven’t had time to pass through the filter of practical experience.

What are some of the features of Go that are important in practice but undervalued in the research literature?

Russ: I think programming language researchers sometimes underestimate the practical importance of simplicity and predictability in a language feature, especially in incorrect programs.

To take one example, Hindley-Milner type inference is very well understood and I think generally accepted in the programming language community as a good feature. But new ML programmers inevitably hit the situation where changing one part of a program causes a mysterious type error in a seemingly unrelated part. The solution is that in practice one writes type signatures for most functions. Hindley-Milner type inference is beautiful research, but it’s not as beautiful in practice. In Go, the only type inference is that if you say var x = e, the type of x comes from e. This rule is simple and predictable and doesn’t suffer from the kinds of “spooky action at a distance” of more complex type inference algorithms. I miss having more complex inference sometimes, but Go’s simple rule handles at least 90% of the cases you care about, with none of the unpredictability or mystery.

Sameer: The focus on software engineering aspects is just as important and often overlooked. For example, Go is amenable to machine transformation. We’ve used this to enforce automatic formatting and automatically update the import lines for a source file. As a result, Go programmers just write their code, then a tool updates their import lines as needed to pull in whatever packages are needed. We are working on new tools to automatically simplify code and change function signatures to plumb request-scoped data. These features allow Go to scale to large code base sizes and large teams and allows us to continue improving the quality of existing code even after the developers have moved on to other projects.

What’s the current state of the language, and what are some short term and long term goals?

Russ: The language is basically done for now. We’ve committed to stability and backward compatibility as language features.

Our short-term focus is on improving the implementation and making Go run in more places. We recently converted the compiler from C to Go (mechanically), and now we’re starting to think about adding SSA-based optimizations. In the runtime, the main focus is on implementing a concurrent garbage collector with bounded pauses. We’re also looking at making Go work for all the places that people run code today, from networked servers to mobile devices and everything in between.

In the long term, we want to make sure that Go continues to be a stable, reliable platform for people to get work done. If we keep doing that, we should keep attracting new programmers and growing the community.

What is your view of the value of a PhD when working at a company like Google? Why should people pursue a PhD if they are not going to end up at a University or research lab?

Sameer: Companies like Google value individuals who are self-directed and, in particular, who can think carefully about an underspecified problem, do independent research to find a solution, then create and execute a plan to implement that solution (usually using a team of people). The process of working through a PhD demonstrates many of these skills, but not all of them. In particular, PhD students many not know how to divide up a plan and delegate to teammates, and they might not know how to take a program that works well for small test cases and make it work well at scale in a production environment. These latter skills are often learned on the job or through side projects.

Russ: Fundamentally, learning how to do research is learning how to identify, develop, evaluate, and present new ideas. Most technology companies start with new ideas, so that’s a great fit. I don’t think it’s a coincidence that there are so many companies started by grad school dropouts.

But the new ideas don’t stop there. Computer hardware improves at such an incredible rate that, especially at a company like Google, software engineers having to be exploring new ideas to keep up and make the best use of that hardware. Much of the software development at Google has a research component, and papers Google has published about what were first and foremost development efforts have nonetheless had significant impact on the research literature. For more about that, I suggest Alfred Spector, Peter Norvig, and Slav Petrov’s paper, Google’s Hybrid Approach to Research. And there is significant research at companies across the technology industry, not just at Google.

It is true, however, that research in the technology industry is inherently focused on ideas with practical applications, which makes it somewhat narrower in scope than academic research. There are positive and negative aspects to that.

What are some key lessons you’ve learned (about anything at all!) while working on Go?

Sameer: A small, highly-skilled team with a few talented designers is better than a large team with lots of opinions. A small, useful language with a few powerful features is better than a large language with lots of features. There’s great value in keeping a language small: it makes code much easier to read, write, and maintain, since there are fewer reasonable ways to accomplish a task.

Russ: The most important thing I learned is that a successful programming language is about far more than the language itself. ¹ We spent a lot of time on the language definition and implementation, but we’ve spent even more on making it easy to get started with Go, trying to write good documentation, making sure that the right tools are in place for people to collaborate and share code, and cultivating active but respectful discussion lists for users and developers. I’m particularly grateful to all the excellent developers who have joined us in using and working on Go. The contributions from the open source community have been truly amazing!

Follow Russ and Sameer on Twitter at @_rsc and @sajma, respectively.

Notes:

This jives with data we presented in an earlier post about language adoption, which founds that developers rate libraries as the top feature of a language ecosystem. ↩

The post Interview with Go’s Russ Cox and Sameer Ajmani appeared first on The Programming Languages Enthusiast.

URL: http://www.pl-enthusiast.net/2015/03/25/interview-with-gos-russ-cox-and-sameer-ajmani/

Telling Stories - 01:15

Ads by Project Wonderful! Your ad could be here, right now.

I will be at Emerald City Comic Con this weekend! Come find me at the Topatoco table, booth #1102 on the skybridge! I will have books and shirts and art and more for sale!

I ALSO have a new t-shirt that you should buy if you like tacos:

URL: http://questionablecontent.net/view.php?comic=2924

Tuesday, 24 March 2015

Opening Up - 02:00

Ads by Project Wonderful! Your ad could be here, right now.

Emerald City Comic Con is this weekend! I will be there at the Topatoco table! Come say hi!

Also I have a new shirt for sale:

URL: http://questionablecontent.net/view.php?comic=2923

Monday, 23 March 2015

Provision Of Comfort - 02:09

Ads by Project Wonderful! Your ad could be here, right now.

It is pretty wonderful.

I will be at Emerald City Comic Con in Seattle this weekend! Come find me at the Topatoco table! It will be good times!

URL: http://questionablecontent.net/view.php?comic=2922

Friday, 20 March 2015

Pseudonyms - 02:51

Ads by Project Wonderful! Your ad could be here, right now.

Ain't no baroness like an insane baroness

URL: http://questionablecontent.net/view.php?comic=2921

Thursday, 19 March 2015

No Backsies - 01:08

Ads by Project Wonderful! Your ad could be here, right now.

New Deathmole album! Buy it now!

URL: http://questionablecontent.net/view.php?comic=2920

Wednesday, 18 March 2015

Support Support - 02:06

Ads by Project Wonderful! Your ad could be here, right now.

Do you remember the Deathmole Kickstarter from a while back? One of the stretch goals was to re-record my old album Advances, and the finished version is finally out! You can listen to and buy it here. People who backed the kickstarter can expect an email with a free download code soon!

URL: http://questionablecontent.net/view.php?comic=2919

Tuesday, 17 March 2015

Promotional Business - 01:31

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=2918

Monday, 16 March 2015

Teaching and Researching Programming Languages at a Liberal Arts College - 14:20

During my tenure as a student and professor, I have been to many talks offering career advice to graduate students. Most of these talks focus on careers in research universities and industrial research labs, and leave out discussion of institutions, such as liberal arts colleges, that are primarily concerned with undergraduate education. This is unfortunate because many liberal arts colleges are highly selective institutions that offer exciting careers that mix research and teaching, albeit in a different way than careers in research universities.

One way to reduce the information deficit about liberal arts colleges is to report on the experiences of those who work at one. This is what we do in the present post. Specifically, I interview Steve Freund, who is a professor of computer science at Williams College, ranked by US News as the top liberal arts college in America. Steve is a highly successful PL researcher, known for his significant contributions to the analysis of concurrent programs. As a result, he is in a great position to give PL Enthusiast readers a view into what it’s like to be a teacher and researcher at a liberal arts institution.

Tell us a bit about your background and your research.

I am currently a Professor of Computer Science at Williams College. Williams is a small, selective liberal arts college of about 2,200 undergraduates in Western Massachusetts.

I began my academic career as an undergraduate at Stanford University. Those were quite formative years, both in terms of my research trajectory and my eventual career path. I spent many quarters as a section leader for the introductory computer science courses, which exposed me to the challenges and satisfactions of teaching early on and also motivated me to work with Eric Roberts on a programming environment designed specifically for introductory students. That project in turn led to my interest in programming languages, which I continued to study at Stanford as a graduate student with John Mitchell.

After defending my thesis, I spent several years as a Research Scientist at the DEC (and then Compaq) Systems Research Center in Palo Alto before moving to Williams in 2002. Since that time I have focused primarily on software reliability and program analyses to enforce various correctness properties for concurrent programs, including, for example, race freedom, atomicity, and cooperability. I draw on static, dynamic, and hybrid techniques, and my work typically involves both the theoretical study of new analyses as well as their implementation and validation.

In teaching at Williams, you are different from most other top-tier researchers in our field, who tend to work out of research universities or industry labs. At the same time, your education was entirely at a research university (Stanford). What made you decide to go to Williams?

Liberal arts colleges were not even on my radar when I finished my PhD. I knew they existed but didn’t really appreciate what they are all about. I did have a sense of what life would be like at a research university — especially since both my father and brother are professors at research universities — but research universities didn’t feel like the right environment for me. That led me to join SRC, where I was already working as a consultant and where I knew I’d continue to have a great time collaborating with many excellent colleagues at the lab.

While my time at SRC was quite productive in many ways, it wasn’t long before I began to miss being in an academic environment. One day it occurred to me to call Kim Bruce and ask him about teaching at a college like Williams, where he had been for many years. After chatting with him, I began to think that such a school may offer a career balance closer to what I wanted, namely the ability to not only pursue my own research but also to teach and work closely with undergraduates, which is the part of teaching that I felt most excited about. Williams was hiring that year, and things worked out as well as I could have hoped. In many ways I have Kim to thank for demonstrating the full potential of a career at a liberal arts college and for encouraging me to pursue it.

How, in your experience, does working a liberal arts college like Williams compare with working at a research university?

The core mission of a liberal arts college is to educate undergraduates in the broadest sense possible and with a particular emphasis on critical thinking, communication, and reasoning skills. We don’t have a graduate program and, with eight faculty, my department is quite a small department by research university standards (but among the largest for liberal arts colleges!).

It shouldn’t be surprising that I spend a lot of time teaching and working with undergraduates, and we place a heavy emphasis on direct interactions between students and faculty and providing personalized learning opportunities. However, research does play a central role in what I do. Beyond enabling me to advance to the state-of-the-art, conducting research helps me maintain perspective on the field, stay engaged with the material I teach, and train the next generation of researchers.

Since I have no graduate students, I do need to be careful to follow a research agenda that’s feasible within my time and resource constraints, but so far that hasn’t been a real issue. I should say that maintaining external collaborations has been really important to me since moving to Williams. Since I’m the only PL person in town, my collaborations prevent me from feeling too isolated from the larger research community.

I like being in a small department and at a relatively small institution. We are able to foster a very welcoming and vibrant academic community, and I enjoy my daily interactions with faculty and students from across the entire college. Our size does, however, necessitate everyone pitching in to maintain our program and community, so I do end up taking on many service activities around the department and college.

What are the expectations at Williams about balance between teaching and research?

At Williams I’m able to maintain a fairly healthy balance between teaching and research, with roughly equal amounts of time and energy spent on each over the course of the year. My research time is concentrated during the summer months and over breaks since teaching’s daily activities and deadlines end up consuming most of my attention when classes are in session. However, even during the busiest semesters I can usually carve out enough time for research that I don’t completely lose momentum, though it does take discipline and attention to time management.

Sabbaticals are also key, both to recharge my batteries for teaching and also to immerse myself in research and long-term planning to an extent that’s hard to attain even over the summers.

In terms of research output and expectations, my goal is to produce the highest quality research possible, but obviously not at the same rate as colleagues at research universities where priorities may be different and where graduate students can assist in the work. Publishing perhaps one or two nice results a year is how I like to think in terms of long-term productivity. In my experience, that matches the personal expectations faculty have at liberal arts colleges like Williams, as well as the college’s expectations for advancement/tenure. Excellence in both teaching and research is highly valued, and both play significant roles in hiring and promotion decisions at Williams. That is true at many liberal arts colleges, but the relative importance of research and teaching does vary a lot from school to school.

What about external funding?

Funding through external grants is not as necessary as at many research universities, but it can definitely make life a bit easier by providing support for travel, equipment, research students, etc. So I do spend some time writing proposals and managing grants, but this is a fairly small fraction of my work when compared to faculty running larger research programs at universities.

I should also note that being at a liberal arts college does not preclude me from applying for funding from the NSF or other sources, and indeed there are a number of funding opportunities specifically for faculty at liberal arts colleges. At Williams, many other faculty also seek funding opportunities, but that is not universal or necessarily required.

What classes do you teach?

About half of the classes I teach are at the introductory level, typically either the first-semester programming course or the second course on data structures. Teaching introductory classes and their lab sessions keeps me pretty busy, but these classes are also a lot of fun, and I really enjoy working with students just starting out in our discipline.

I also teach an upper-level class on programming languages most years, as well as a tutorial on compiler design. Tutorials at Williams are modeled after those at Oxford and are one of our signature educational elements. They are small classes of ten students and have no lecture or seminar component. Students explore the material independently and then meet with me in pairs for an hour or two each week. In those meetings, the two students work through problems they were stuck on, explore alternative approaches, and discuss the topics in a larger context. (And since my tutorial is on compilers, they build a large compiler over the semester as well.) Tutorials are the sort of “high touch” teaching experience that I think distinguishes Williams from other types of institutions. I’m also starting to think about a new software methods and engineering course for students early in our major as well.

Do you involve undergraduates in your research?

Yes, providing research opportunities and preparing students to carry out original research is very important to me.

All of the faculty in my department typically work with one or more students over the summer as part of a sciences-wide undergraduate research program. My summer projects are often relatively self-contained pieces of whatever problems I’m thinking about. Those summer experiences often lead to year-long honors theses, which are much more substantial and open-ended projects entirely of the student’s own design.

Motivated and talented undergraduates can make great contributions to the type of research I conduct, and it’s gratifying to see many of my research students choose to continue their studies in PhD programs, but I think the research experiences I’m able to provide substantially benefit students regardless of whether they are bound for graduate school or some other endeavor.

What advice do you have for graduate students who may wish follow a career path similar to yours?

First, cast a wide net and look broadly at what opportunities are out there. Talk to people at various types of colleges and universities, and ask them about how they ended up where they are and how they spend their time. There is no single “right” way to have a career in computer science or in academia, and you may end up very excited by options you didn’t know existed.

Also, if you think you might like teaching and working with students, try it out. Volunteer to give a lecture or two for a class you’re TAing, offer to mentor an undergraduate interested in your area, etc. If those things pique your interest, try to find an opportunity to teach an entire course or take on other larger mentoring responsibilities. We don’t necessarily expect job candidates to already be exceptional teachers, but we do expect them to show potential and to have an enthusiasm for and a history of being engaged in teaching. These sorts of experiences help you demonstrate that when it comes time to write your teaching statement, and they will also guide you to a more informed decision about what comes next, regardless of what it may be.

I still remember several people telling me I’d be “throwing away my career” if I took a job at Williams. Others considering liberal arts colleges have told me of hearing similarly dire warnings. If you do hear that sentiment, recognize that people have different priorities and measures of success for their careers. For some, liberal arts colleges are indeed not a good fit. For others, however, they can offer exactly the set of challenges, opportunities, impacts, and satisfactions you may be looking for.

The post Teaching and Researching Programming Languages at a Liberal Arts College appeared first on The Programming Languages Enthusiast.

URL: http://www.pl-enthusiast.net/2015/03/16/teaching-and-researching-pl-at-a-liberal-arts-college/

Thursday, 12 March 2015

The Reckoning - 02:42

Ads by Project Wonderful! Your ad could be here, right now.

RIP

URL: http://questionablecontent.net/view.php?comic=2917

Wednesday, 11 March 2015

Morning Ritual - 01:43

Ads by Project Wonderful! Your ad could be here, right now.

honk honk

URL: http://questionablecontent.net/view.php?comic=2914

Tuesday, 10 March 2015

Life Coach - 01:24

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=2913

Monday, 09 March 2015

Nothing Good Can Come - 00:58

Ads by Project Wonderful! Your ad could be here, right now.

Marten is all "nooooooope"

URL: http://questionablecontent.net/view.php?comic=2912

Friday, 06 March 2015

To Serve And Protect - 02:36

Ads by Project Wonderful! Your ad could be here, right now.

aww yeah ending this week on a classy note

URL: http://questionablecontent.net/view.php?comic=2911

Thursday, 05 March 2015

Listen Closely - 03:48

Ads by Project Wonderful! Your ad could be here, right now.

Mar-bear ain't give a fuuuuuck

URL: http://questionablecontent.net/view.php?comic=2910

Wednesday, 04 March 2015

Dummy - 02:42

Ads by Project Wonderful! Your ad could be here, right now.

you idiot

URL: http://questionablecontent.net/view.php?comic=2909

Tuesday, 03 March 2015

Toasty - 02:34

Ads by Project Wonderful! Your ad could be here, right now.

making toast is an intimate act

URL: http://questionablecontent.net/view.php?comic=2908

Monday, 02 March 2015

Peace Out - 03:08

Ads by Project Wonderful! Your ad could be here, right now.

Faye is very good at flipping tables

URL: http://questionablecontent.net/view.php?comic=2907

Friday, 27 February 2015

Possibly OVERqualified - 03:33

Ads by Project Wonderful! Your ad could be here, right now.

Reminder that I also have another comic that is this weird science fiction thing! It just keeps getting weirder!

URL: http://questionablecontent.net/view.php?comic=2906

Thursday, 26 February 2015

Winslow Is Helping - 03:12

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=2905

Wednesday, 25 February 2015

It's Synthetic, Of Course - 02:51

Ads by Project Wonderful! Your ad could be here, right now.

flump

URL: http://questionablecontent.net/view.php?comic=2904

Tuesday, 24 February 2015

It's Normally Kept Neatly Folded - 03:05

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=2903

Monday, 23 February 2015

Battle Manga - 03:53

Ads by Project Wonderful! Your ad could be here, right now.

Not frail, just weak

URL: http://questionablecontent.net/view.php?comic=2902

Friday, 20 February 2015

Awooooo - 04:46

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=2901

Thursday, 19 February 2015

Victory Music - 03:27

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=2900

Wednesday, 18 February 2015

It Had To Be Said - 03:01

Ads by Project Wonderful! Your ad could be here, right now.

I am pleased to announce that my friend Jedediah Berry has a new Kickstarter! It is for a story-in-cards designed to be shuffled before each reading, leading to a different experience each time. It is called "The Family Arcana" and you can check it out here!

URL: http://questionablecontent.net/view.php?comic=2899

Friday, 13 February 2015

Parley - 03:08

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=2896

Thursday, 12 February 2015

Plan Of Action - 01:39

Ads by Project Wonderful! Your ad could be here, right now.

Always have a plan.

URL: http://questionablecontent.net/view.php?comic=2895

Wednesday, 11 February 2015

Eat That Foot - 04:49

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=2894

Tuesday, 10 February 2015

Night Secrets - 01:58

Ads by Project Wonderful! Your ad could be here, right now.

been there, done that

URL: http://questionablecontent.net/view.php?comic=2893

Monday, 09 February 2015

Maybe Some Contacts - 03:34

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=2892

Friday, 06 February 2015

You And Me - 04:17

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=2891

Thursday, 05 February 2015

Comfort - 03:19

Ads by Project Wonderful! Your ad could be here, right now.

I have a new shirt for sale! It is about how living in the future means we have portable supercomputers in our pockets instead of jetpacks, and how great that is.

www.topatoco.com/qc

URL: http://questionablecontent.net/view.php?comic=2890

Wednesday, 04 February 2015

Never Again - 05:37

Ads by Project Wonderful! Your ad could be here, right now.

I have a new shirt for sale! It is about how living in the future means we have portable supercomputers in our pockets instead of jetpacks, and how great that is.

www.topatoco.com/qc

URL: http://questionablecontent.net/view.php?comic=2889

Tuesday, 03 February 2015

Uh Buh Buh Buh - 03:25

Ads by Project Wonderful! Your ad could be here, right now.

I have a new shirt for sale! It is about how living in the future means we have portable supercomputers in our pockets instead of jetpacks, and how great that is.

www.topatoco.com/qc

URL: http://questionablecontent.net/view.php?comic=2888

Monday, 02 February 2015

Hidey Holes - 03:09

Ads by Project Wonderful! Your ad could be here, right now.

URL: http://questionablecontent.net/view.php?comic=2887

Friday, 30 January 2015

Reassurance - 04:26

Ads by Project Wonderful! Your ad could be here, right now.

QC will be happy again someday I promise

URL: http://questionablecontent.net/view.php?comic=2886

Thursday, 29 January 2015

Gimme That Bear - 02:02