Ethical Algorithms

Bonus Episode 291 of the Cyberlaw Podcast

The Volokh Conspiracy

Algorithms are at the heart of the Big Data/machine learning/AI changes that are propelling computerized decision-making. In their book, The Ethical Algorithm, Michael Kearns and Aaron Roth, two Computer Science professors at Penn, flag some of the social and ethical choices these changes are forcing upon us. My interview with them touches on many of the hot-button issues surrounding algorithmic decision-making.

I have long suspected that much of the fuss over bias in machine learning is a way of smuggling racial and gender quotas and other academic social values into the algorithmic outputs. Michael and Aaron may not agree with that formulation, but the conversation provides a framework for testing it – and leaves me more skeptical about claims that "AI bias" is the problem it's been portrayed to be.

Less controversial, but equally fun, is our dive into the ways in which Big Data and algorithms defeat old-school anonymization – and the ways in which that problem can be solved. The cheating husbands of Philadelphia help me understand the value and technique of differential privacy.
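For readers who want to see the mechanics, here is a minimal sketch of randomized response, one classic differentially private survey technique. It is an assumption that this matches the example discussed in the interview; the function names, the 30% true rate, and the sample size are all hypothetical. Each respondent flips a coin and answers truthfully on heads; on tails a second coin decides the answer, so no single "yes" is incriminating, yet the overall rate can still be estimated.

```python
import random


def randomized_response(truth: bool, rng: random.Random) -> bool:
    """Return a noisy answer that hides the respondent's true answer."""
    if rng.random() < 0.5:
        # First coin is heads: answer truthfully.
        return truth
    # First coin is tails: a second coin picks the answer at random.
    return rng.random() < 0.5


def estimate_true_rate(answers: list) -> float:
    """Invert the noise: P(yes) = 0.5 * true_rate + 0.25."""
    yes_rate = sum(answers) / len(answers)
    return 2 * (yes_rate - 0.25)


# Hypothetical survey: 100,000 respondents, 30% of whom would truthfully say yes.
rng = random.Random(0)
true_rate = 0.30
population = [rng.random() < true_rate for _ in range(100_000)]
answers = [randomized_response(t, rng) for t in population]
estimate = estimate_true_rate(answers)
print(f"estimated rate: {estimate:.3f}")
```

In this sketch a "yes" is only three times as likely to come from a true "yes" as from a true "no" (0.75 versus 0.25), which is what gives each respondent deniability while leaving the aggregate estimate usable.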

And if you wondered why, say, much of the social science and nutrition research of the last 50 years doesn't hold up to scrutiny, blame Big Data and algorithms that reliably generate a significant correlation once in every 20 tries.
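The arithmetic behind "once in every 20 tries" can be sketched in a few lines. This assumes the conventional significance threshold of 0.05 and independent tests run on pure noise, neither of which is spelled out in the post; the variable names are illustrative only.

```python
import random

ALPHA = 0.05           # assumed significance threshold
TESTS_PER_STUDY = 20   # assumed number of correlations a researcher tries


def study_finds_something(rng: random.Random) -> bool:
    """Simulate one study on pure noise.

    When there is no real effect, a p-value is uniformly distributed on
    [0, 1], so each test comes up "significant" with probability ALPHA.
    """
    return any(rng.random() < ALPHA for _ in range(TESTS_PER_STUDY))


rng = random.Random(1)
n_studies = 20_000
hits = sum(study_finds_something(rng) for _ in range(n_studies))
simulated = hits / n_studies

# Closed form: chance that at least one of 20 independent tests is a false positive.
analytic = 1 - (1 - ALPHA) ** TESTS_PER_STUDY

print(f"simulated: {simulated:.3f}, analytic: {analytic:.3f}")
```

Under those assumptions, a researcher who tries 20 correlations on noise finds at least one "significant" result roughly 64% of the time.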

Michael and Aaron also take us into the unexpected social costs of algorithmic optimization. It turns out that a recommendation engine that produces exactly what we want, even when we didn't know we wanted it, is great for the user, at least in the moment, but maybe not so great for society. In this regard, it's a little like creating markets in areas once governed by social norms. The switch to market pricing instead of societal mores often optimizes individual choice but at considerable social cost. It turns out that algorithms can do the same – optimize individual gratification in the moment while roiling our social and political order in unpredictable ways. We would react badly to a proposal that dating choices be turned into more efficient microeconomic transactions (otherwise known as prostitution) but we don't feel the same way about reducing them to algorithms.

Maybe we should.

Download the 291st Episode (mp3).

You can subscribe to The Cyberlaw Podcast using iTunes, Google Play, Spotify, Pocket Casts, or our RSS feed!

As always, The Cyberlaw Podcast is open to feedback. Be sure to engage with @stewartbaker on Twitter. Send your questions, comments, and suggestions for topics or interviewees to CyberlawPodcast@steptoe.com. Remember: If your suggested guest appears on the show, we will send you a highly coveted Cyberlaw Podcast mug!

The views expressed in this podcast are those of the speakers and do not reflect the opinions of the speakers' families, friends, a growing number of former friends, clients, or institutions. Or spouses. I've been instructed to specifically mention spouses.



  1. We would react badly to a proposal that dating choices be turned into more efficient microeconomic transactions (otherwise known as prostitution) but we don’t feel the same way about reducing them to algorithms.

    Nah. People are okay with that, they just don’t like codifying what’s going on. People like their illusion.

    1. Well, aside from the fact that prostitution is not dating, and “dating algorithms” are not transactions between the two “daters”, the comparison almost has meaning.

    2. People like their illusion

      Not only do they like it, they act on it. It’s much harder for me to feel good about saying to – say a minority potential tenant originally from a doubtful zip code – sorry, bud, don’t know anything about you personally but my feeling about your background is that you’re more likely to be a bad risk than I’m comfortable with. (Assuming it was legal.) Than to say sorry bud, my computer algorithm has rejected you.

      I don’t just mean in the sense of an uncomfortable social interaction, but in the sense of my own self worth.

      That’s why impersonal corporations, and governments, are capable of being ruthless more consistently than small businesses which are the immediate extensions of a particular person or a small group of humans. Even though the corporations are peopled by actual humans. Ditto the government.

      1. Of course, not all small businesses share your feelings. Some of them are quite happy to toss deserving people aside if it means making an extra 3 cents on the transaction. I don’t share your inherent suspicion of corporations or government, by which I mean bureaucracy. The trick is to fine-tune the algorithms used by the bureaucracy to take into account all the relevant factors.

        1. You are incorrect.

          The point is that if you – say – give a job to an underqualified minority hire, because you think it’s fair that he should be given a chance, when you could just as well have given the job to a well qualified non minority hire, you may be being “fair” to the hire who has been given a leg up (for some values of “fair”) but you are cheating the business owner. If you are the business owner, that’s OK. Nothing wrong with cheating yourself to soothe your conscience. But if you are not the business owner, you’re abusing a position of trust.

          Corporations require a sense of loyalty to someone and something other than your own self – the business. If you indulge your own feelings at the expense of the business, you are in substance embezzling company funds. People understand the difference between compassion deployed at their own personal cost, and embezzling other folks’ funds to finance their compassion.

          1. “The point is that if you – say – give a job to an underqualified minority hire, because you think it’s fair that he should be given a chance, when you could just as well have given the job to a well qualified non minority hire, you may be being “fair” to the hire who has been given a leg up (for some values of “fair”) but you are cheating the business owner. If you are the business owner, that’s OK. Nothing wrong with cheating yourself to soothe your conscience. But if you are not the business owner, you’re abusing a position of trust.”

            This is a pretty significant tangent from the point I actually made, which is that while some small business owners might give a job to a potentially underqualified candidate, because they’re not “consistently ruthless”, some small business owners are, indeed, consistently ruthless.

            Now, in a bureaucracy (of any kind), the point is that people don’t make decisions… the decisions have already been made, and the bureaucrats just apply them. If you adjust what the prefabricated decisions are, you get more or less “ruthless” decisions. The point of well-built bureaucracy isn’t ruthlessness, it’s consistency. If the decision you want is ruthless, then a bureaucracy will be consistently ruthless. If the decision you want is NOT ruthless, tweak it, and the bureaucracy will be consistently non-ruthless.

            “Corporations require a sense of loyalty to someone and something other than your own self – the business.”

            Another tangent. The question isn’t whether you’re loyal to the corporation, the question is whether you look to the corporation’s long-term goals, or stay solely in the short-term.

            Example: Customer service. Good customer service costs. But it also pays, in customer loyalty. Do you focus on the short term (screw the customer, we already have his money) or on the long-term (serve the customer, and keep getting his money on into the future)?

  2. re: “algorithms that reliably generate a significant correlation once in every 20 tries”

    Always worth repeating. https://www.xkcd.com/882/

  3. “fuss over bias in machine learning is a way of smuggling racial and gender quotas”

    This is an awfully conspiratorial way of looking at things. I am not going to lie, I think it illustrates severely problematic thinking to prioritize your “political concerns” over actually thinking through an issue and taking it seriously. It shows that politics is dominating over intellectual ability within this context.

    The reason we should be concerned about machine learning bias is because it can pick up correlations that are completely unfair from the standpoint of MLK’s idea of judging a person by “the content of their character.”

    Take zip code for example. Perhaps merely living in a certain zip code is correlated with (1) better employment outcomes and being (2) less likely to default on a loan.

    A machine learning algorithm may, in the absence of other information, pick up on such a correlation.

    But should a person be denied a job because of the zip code they were raised in? Or credit for that same reason? That certainly is quite contrary to trying to judge a person on their merit or giving them a fair shot to prove themselves.

    Concerns about bias are not merely “quotas.” A person who is denied a job because they were raised in the wrong zip code may in fact end up being a worse credit risk, but PRECISELY because they were denied a job based on zip code. All of a sudden, a fact that ought to be irrelevant (namely zip code), will become an important input into whether a person even has a fair opportunity to reach their potential. And if anything, such blatant unfairness will increase the case for quotas. After all, there can only be one principled objection to quotas. That they are unfair. But, if everything else is systematically unfair, the unfairness of quotas is less problematic.

    By not taking the problem of bias seriously, Stewart actually makes the solution he wants to avoid more likely. More generally, you can’t claim to believe in meritocracy or the allocation of opportunities based on merit if you don’t take fairness seriously.

    1. Fairness defined by whom? That’s the crux of libertarianism. Once you let politicians define these things, they become political footballs, the “solutions” change with every election, and the “cures” are worse than the disease.

      Much much better to let bigots be bigots for the world to see and individually boycott. That way the bigots see the real world consequences of their bigotry, and have to either shape up and stop being bigots, or remain bigots and lose business.

      “Fairness” is far too subjective to be the subject of laws.

      1. Much much better to let bigots be bigots for the world to see and individually boycott.

        Bullshit.

        You are simply not addressing David’s point, and letting your biases cause you to rationalize throwing up your hands and giving up.

        The problem with the sort of problem he cites is that the algorithm may not, probably doesn’t, have any moral sense. It never makes tradeoffs that sacrifice economic efficiency in the interest of equity.

        1. “It never makes tradeoffs that sacrifice economic efficiency in the interest of equity.”

          Right, it doesn’t racially discriminate in order to achieve quotas.

        2. Who gets to define “equity”? It’s subjective, whether you call it “fairness” or “equity”. Someone has to decide whether the algorithm is fair or equitable, and once again, you want it to be government, as run by you and your side, because only you know what is fair or equitable.

          1. The point being that an algorithm can be fair to individuals, but not to collective society, or vice versa. When a person does this, we can appeal to their morality. An algorithm, of course, has no morality. (The response to this is that whoever creates the algorithm does, and we can appeal to their morality to alter the algorithm to correct the deficiencies.)

            1. “The point being that an algorithm can be fair to individuals, but not to collective society, or vice versa.”

              No. Simply no. Fairness to individuals is the only real fairness. You can’t build a fair society out of individual acts of injustice.

              People start talking about “social justice” because they want an excuse to commit individual injustices, because there’s something they value more than justice. Not because there’s some fictional collective justice you can construct out of injustice.

              1. Fairness to individuals is the only real fairness

                I agree there’s a serious measurement problem with the idea of “fairness to collective society” since there’s no canonical method of measuring the trade off of fairness to A against fairness to B.

                Fairness to individuals does at least have the merit of avoiding this sort of measurement problem, since the opinion of each individual can reasonably be regarded as canonical in respect of himself.

                But there is still no way to reconcile my opinion of what is fair to me, with your opinion of what is fair to you. Of course we can demand equality of fairness in the sense of subjective consistency – that I should accept that what is fair for me is fair for you. And vice versa. But that still doesn’t guarantee that we will each finish up with the same idea of what is fair.

                It’s still irretrievably muddy since the whole concept of “fairness” is subjective.

                1. “I agree there’s a serious measurement problem with the idea of “fairness to collective society” since there’s no canonical method of measuring the trade off of fairness to A against fairness to B.”

                  Depends on what kind of situation you’d like to examine. While some are indeed difficult or impossible to measure accurately, some can be measured quite effectively. Traffic engineering depends on accurate measurement. There are complicating factors. Is there a measurable benefit to an individual if an ambulance can get to the hospital in 5 minutes instead of 45? Does not being dead of stroke count as a measurable benefit? By contrast, the benefits of truckloads of merchant goods being able to traverse an urban area in 55 minutes instead of 65 seem pretty trivial, unless you’re the shipping company paying wages to the drivers of all those trucks. But at a personal level, being able to get home from work in 25 minutes instead of 30 is priceless.

              2. “No. Simply no. Fairness to individuals is the only real fairness.”

                You’re an idiot if you actually believe this.

                An uncontrolled intersection provides individual fairness. Each person comes to the intersection and decides for him or herself if it’s safe to proceed.
                A stoplight provides collective fairness. Sometimes it’s individually unfair… a person who is the only motorist at the intersection has to stop and wait for the light to change.
                So, why do we have stoplights? Because it’s safer AND faster, on average, than leaving all the intersections uncontrolled.

                Don’t like traffic lights? OK. Paying taxes to keep a fire brigade can also be unfair to individuals (no fire in your house? Pay the tax anyway) but benefits, collectively… you’re ahead if they keep the fire in your neighbor’s house from spreading to yours.

        3. “The problem with the sort of problem he cites is that the algorithm may not, probably doesn’t, have any moral sense.”

          Using the example David had, what’s with “may not”? It definitely doesn’t. It’s just trying to figure out credit risk correlation based on available data. The answer to that question is of pretty significant interest to people who make loans. The fact that the algorithm leads to answers you (or I) find morally reprehensible is of no moment.

          I assume there are some banks that would “sacrifice” your “interest of equity” (however defined) over “economic efficiency”. They should be free to make that trade off.

          1. The problem is, the fact that two of my neighbors defaulted on their loans doesn’t actually tell you anything about whether or not my loan is sound.

            1. The problem is that, actually, it does. Not as much as a detailed investigation into your personal circumstances, of course, but finding out if two of your neighbors defaulted is a LOT cheaper to find out.

              1. “The problem is that, actually, it does.”

                The problem is that, no, it doesn’t… unless a sinkhole or a volcano or something has recently destroyed all three properties, in which case the fact that a volcano recently destroyed my property is probably the takeaway, not that it got two of my neighbors, as well.

            2. The discussion above assumes that the likelihood of you defaulting on your loan does correlate with the likelihood of your neighbors defaulting on their loans. That’s why David Welker posited:

              “Take zip code for example. Perhaps merely living in a certain zip code is correlated with (1) better employment outcomes and being (2) less likely to default on a loan.”

              Correlation does not show causation, but to a lender that doesn’t matter. If the data is only robust enough to demonstrate correlation rather than strict causation, the bank is still going to rely on the correlative data. That’s why it is so important to support robust data, since additional inputs are the only chance we have of eventually narrowing it down to strict causation.

              1. “The discussion above assumes that the likelihood of you defaulting on your loan does correlate with the likelihood of your neighbors defaulting on their loans”

                But, there’s no way to use that predictively. You don’t know the odds that I’m going to default on my loan until they are either 0% or 100%. Other people in my zip code might be defaulting left and right, but being in the same zip code isn’t causing it.

                1. While it’s true that a mere statistical correlation like that isn’t actually predictive in any individual case, if there really is a correlation due to something that’s ongoing, it will be predictive on a statistical basis for large groups. That’s what correlation MEANS.

                  Now, undoubtedly businesses could do better in terms of accuracy by hiring somebody to do a detailed investigation of your individual circumstances, and they’d actually do that if a really large sum was at stake. But for smaller transactions the cost of doing this would swallow any potential profit.

    2. You’re conflating a few things that I’ve often seen conflated with machine learning compared to algebraic models, which might change your mind (and if not, better understanding is good regardless).

      A machine learning model, by definition, cannot be biased (assuming it wasn’t literally programmed to be, in which case it’s not the machine learning that’s doing it). Machine learning works by finding combinations of things that have predictive value, without knowing what any of those things are. While we typically have things encoded into a human-friendly format – with distinct things like age, gender, and race – the machine learning model doesn’t see any of those. Instead (and using a neural network as an example) machine learning only sees undifferentiated bits – just a long string of zeros and ones. A typical example might look like 00001011010100101, and what the network does is determine which combinations have predictive power by testing every combination individually and with others. In application that would mean that something like race gets coded to look just like hair color – both are some string of zeros and ones that look like 0001, 01101, 11101, etc.

      The algorithm doesn’t know what any of these mean to a human, and so it cannot be biased.

      What it can do, however, is uncover correlations that are unintuitive, or that for some other policy reason we don’t want used, but these restrictions can only make the models less accurate, while keeping in mind that as Yogi Berra said, “predictions are hard, especially about the future.”

      But those unintuitive correlations may have an underlying meaning we think is appropriate to use. The Nigerian Prince scam, for example, is still in use precisely because it looks like a scam – the scammers are trying to get people to self identify as a sucker so they don’t waste time on someone they won’t be able to convince, so they solicit people in a way to detect suckers – anyone who responds is effectively holding their hand in the air as saying “please steal my money.” The same applies to things like zip codes (and this is speculation, but it’s informed speculation based on health research, which is my field). A person who chooses to live in a particular zip code is self identifying certain likely characteristics, which might include: I work nearby, I like (specific ethnic food based on local businesses), I’m a bigot who doesn’t want (specific ethnicity who is rare in that zip code), I’m a drug dealer and this is where my customers live, I care about education (if local schools are especially good), or a plethora of other information. Each of these may have predictive power, and a decent machine learning algorithm will piece them apart – some zip codes may be useful in identifying better or worse credit risks, while some zip codes aren’t. Machine learning (if competently implemented) will also update itself, so that as these factors change over time the algorithm changes too. It’s not just a matter of changing how things are weighted, but whether they’re included at all.

      But these aren’t biases, they are information that is useful in predicting the future. We may think it’s a bad public policy to use them, but that’s a different argument, and one that gets conflated far too often to be useful as people may agree it’s improper to use something while also thinking it’s more accurate, while their opponent tries to convince them it’s not accurate, even though they agree on the actual facts because they’re talking past each other.

    3. “The reason we should be concerned about machine learning bias is because it can pick up correlations that are completely unfair from the standpoint of MLK’s idea of judging a person by “the content of their character.””

      Which is just another way of saying that you’re demanding that racial quotas be smuggled into the algorithms.

      It’s not enough that you don’t want the algorithm to take into account race. That’s easy: You don’t tell it race, it doesn’t take it into account.

      No, you don’t want the algorithm to take into account perfectly legitimate things if doing so produces results that correlate with race.

      But the races are not, typically, identically situated, which means that producing results that don’t correlate with race actually requires taking race into account, only doing so in a way that reduces, not increases, the predictive value of the model.

      Like, you can take a criminal record into account in determining how likely a possible hire is to steal from the company, unless the file is flagged “black”, in which case you apply a factor to reduce the weight of the criminal record for no reason but to make your race statistics look good.

      You’re trying to force the algorithms to copy what Harvard is doing: Keeping too many Asian-Americans from being admitted by using the fact that somebody is Asian-American to determine that they have a “bad personality”.

      1. The reason we try to force government to be colorblind is because it’s unfair, not that there are invalid correlations.

        Algorithms sussing these out just re-emphasizes this, which is something we should not forget. It is a conscious and deliberate choice to do this for fairness.

        1. I thought the reason we try to force government to be colorblind is because it’s actually demanded by the 14th amendment, in addition to being unfair to treat people according to their color.

          The basic problem here with the concept of “ethical algorithms”, is that groups are not typically situated the same. Men are on average stronger than women, blacks have higher crime rates than whites, Asian-Americans are more studious than Latinos, and so forth.

          So if your algorithm really does judge people on the content of their character, not the color of their skin, it’s NOT going to meet the quotas, it’s going to produce “disparate impact”.

          And what a lot of people mean by “ethical algorithms” is algorithms that create outcomes that DON’T produce disparate impact, that hire the right percentages regardless of merit, admit the right proportion of ethnic groups.

          They really are demanding that the algorithms implement the quotas.

          1. So if your algorithm really does judge people on the content of their character, not the color of their skin, it’s NOT going to meet the quotas, it’s going to produce “disparate impact”.

            Right, and just like disparate impact lawsuits, while the impact might be sufficient to start a lawsuit/investigation, the disparate impact itself is not sufficient to prove there’s a problem.

            UPS hires more men than women to be truck drivers. Clear case of disparate impact. But when we look at it, we see it’s because (A) the candidate pool is disproportional to men, and (B) more men meet the strength requirements.

            So we control for those variables (which are legitimate and acceptable explanations) and see if the disparate impact disappears. If it mostly does, we’re golden. If it doesn’t, we’re stuck looking for what the explanations are, and desperately hoping it’s not “because sexism” (which would be illegitimate and unacceptable).

            Same thing works with machine learning algorithms. We use disparate impact to identify possible problems, but we don’t know that we have an actual problem until we start controlling for variables to tease out what’s going on. The problem is, us idiot programmers have unintentionally made it obscenely difficult to validate this shit and figure out what they’re actually considering.

            Bottom-line? Yes, disparate impact is a big part in what triggers an investigation. But while it is necessary, it is not sufficient.

            1. And just like the disparate impact lawsuits, the only way to avoid the cost of defending against the lawsuits, and the risk of getting the wrong jury, is to implement the quota.

          2. “I thought the reason we try to force government to be colorblind is because it’s actually demanded by the 14th amendment”

            And the reason we put it into the 14th amendment is because…

            This isn’t theology, where we start by assuming our text is inerrant and end the inquiry into its meaning there.

      2. “Which is just another way of saying that you’re demanding that racial quotas be smuggled into the algorithms.”

        No. Just the opposite, we’re demanding that racial quotas be smuggled OUT of algorithms. Because algorithms can find ways to do racial quotas without (directly) referencing race. They find things that correspond to race with high confidence.

        1. It’s easy to keep the algorithm from doing racial quotas. Just don’t tell it anybody’s race, and it can’t seek out things that correspond to race, because it doesn’t have any racial information.

          It may find things that correlate with race, but only because they’re valid predictors, not because they’re race proxies.

          1. “It’s easy to keep the algorithm from doing racial quotas. Just don’t tell it anybody’s race, and it can’t seek out things that correspond to race, because it doesn’t have any racial information.”

            Ah, so you’re not taking this seriously enough to come up with an actual answer?

            1. No, you’re not taking this seriously enough to engage.

              Look, say you’re creating an AI model for admissions to Harvard. You’re tasked with maximizing academic performance in the incoming class, and given no information at all about the race of applicants. You’re going to be radically over-representing Asian Americans, and under-admitting blacks and latinos, because these groups aren’t similarly situated in terms of academic performance.

              Does this mean you’re implementing a racial quota by proxy? No, it means you’re NOT implementing a racial quota.

              In a world where groups are not similarly situated in terms of valid measures of merit, the only way to not produce “disparate impact” is to actively discriminate. In the real world we live in, producing perfect race numbers isn’t proof you’re not discriminating, it’s proof you ARE discriminating!

    4. “But should a person be denied a job because of the zip code they were raised in? Or credit for that same reason? That certainly is quite contrary to trying to judge a person on their merit or giving them a fair shot to prove themselves.”

      The denial will be based on all the correlations, not just the one (zip code) unless that one piece of data is so predictive that nothing else matters. And if that is the case, what possible benefit is there to giving credit or jobs to people who, robust data shows, cannot keep the job or satisfy the credit risk?

      There are a lot of ways that someone with, e.g., a bad zip code, could bolster their prospects for employment or credit in other ways. The best way to avoid meritorious candidates who are getting overlooked because they have the wrong race or zip code, is reliance on as robust data as possible. We want banks issuing loans by relying on data (rather than anecdotes or human bias) that predicts for credit worthiness.

      1. “The denial will be based on all the correlations, not just the one (zip code) unless that one piece of data is so predictive that nothing else matters”

        Correlation is not causation. A person can have perfect credit history, and then have something happen that destroys his ability to meet his obligations. Similarly, a person can have every card in the deck stacked against him, and still pay all his debts. There’s no such thing as “so predictive that nothing else matters”, except maybe “this person is currently dead”.

        1. “Correlation is not causation.”

          So what? We’re talking about evaluating future default risk. Of course there’s no data available to the bank (in the present) that will show the causal failure (in the future). As you said, a person can have a perfect credit history and then become a heroine addict. So the bank wants to find out things that correlate with “become a heroine addict”.

          “There’s no such thing as “so predictive that nothing else matters”,…”

          I agree! The problem is that if you systematically exclude data sets for “ethical” reasons, banks are going to be less likely to identify the actual causative as opposed to correlative factors to default. Without reliable data inputs, the alternative is not “everyone in every zip code gets a loan!” They just stop issuing unsecured loans.

          1. heroine addict

            I’m a bit of a heroine addict myself.

            This is one of my favorites :

            https://en.wikipedia.org/wiki/Andr%C3%A9e_de_Jongh

        2. Correlation is not causation.

          A very tired and irrelevant meme. If we were not discussing humans, but dogs or turnips or engine components, not a person on the planet would doubt the rationality, and benefits, of using correlations to make predictions. Hence raising it when we’re talking about humans is simply an attempt to deflect from the actual point at issue. Which is not whether predictions using correlations are rational, which they obviously are, but whether they are unfair to some humans, despite being rational.

          Since no one cares about fairness to turnips, the “correlation is not causation” cliche never gets trotted out for turnips. Trotting it out for humans is just hand waving.

          1. “A very tired and irrelevant meme.”

            That happens to be directly on-point and relevant, as much as you don’t like it.

            “if we were not discussing humans, but dogs or turnips or engine components, not a person on the planet would doubt the rationality, and benefits, of using correlations to make predictions.”

            And if pigs could fly, nobody would go outside without an umbrella.

            ” Which is not whether predictions using correlations are rational, which they obviously are”

            For some reason, you left out the word “not” which belongs at this point in the sentence.

            Here’s a correlation. 100% of convicted rapists breathe oxygen. With this correlation in hand, obviously we want to go down to the hospital, and arrest everyone on oxygen therapy.

            1. A measure which 100% correlates with trait “A” AND with not trait “A” isn’t predictive of trait “A”, it’s totally useless, and would not originate out of any normal statistical analysis. Statistical analysis tries to find measures that help you distinguish “A” from not “A”.

              You’re either remarkably ignorant of this whole topic, or arguing in bad faith.
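The statistical point in this exchange, that a marker is only useful if it separates people with a trait from people without it, can be checked with a toy calculation. The population, the marker names, and all counts below are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def rate_given_marker(people: list, marker: str) -> Fraction:
    """Share of people carrying the marker who also have the trait."""
    with_marker = [p for p in people if p[marker]]
    return Fraction(sum(p["trait"] for p in with_marker), len(with_marker))


# Hypothetical population of 1,000, of whom 10 have the trait.
people = []
for i in range(1000):
    people.append({
        "trait": i < 10,
        # Everyone has this marker, trait or not (like breathing oxygen).
        "universal_marker": True,
        # 50 people have this marker; 8 of them also have the trait.
        "informative_marker": i < 8 or 958 <= i < 1000,
    })

base_rate = Fraction(sum(p["trait"] for p in people), len(people))
universal = rate_given_marker(people, "universal_marker")
informative = rate_given_marker(people, "informative_marker")

print(f"base rate:               {base_rate}")
print(f"given universal marker:  {universal}")
print(f"given informative marker: {informative}")
```

Knowing that someone carries the universal marker leaves the estimate exactly at the base rate, so it predicts nothing; the second marker moves the estimate, which is what makes a correlation usable for prediction in the first place.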

  4. What this really gets down to is that ethical sense and fiscal sense don’t always align.

    If your algorithm is driven purely by the latter, it will only produce ethical results due to coincidences. If you want it to produce ethical results, you have to better define the ethical parameters first so that it knows what it’s solving for.

    1. The problem being that you can optimize the algorithm to produce the best results for individuals, or you can optimize the algorithm to produce the best results collectively, but you can’t do both at the same time.

      As a greatly simplified example, consider the timings for stoplights. Nobody likes being stuck behind a red light, but being stuck behind reds is the tradeoff we make for sometimes having green and not having to worry so much about the traffic from the other directions in the intersection possibly smashing into the side of your vehicle.

  5. “And if you wondered why, say, much of the social science and nutrition research of the last 50 years doesn’t hold up to scrutiny, blame Big Data and algorithms that reliably generate a significant correlation once in every 20 tries.”

    And that, my friend, is why models for climate and climate change are essentially worthless. Those failed 19/20 algorithms that don’t match the data are never published, which gives the illusion that the remaining 5% you hear about are robust. Throw in a big dose of academic politics, confirmation bias, and an inability to do experiments on meaningful timescale, and it’s easy to see how an entire field can fool itself.
