Machines could soon be able to understand and summarize text for you

We humans are swamped with text. It’s not just news and other timely information: Regular people are drowning in legal documents. The problem is so bad we mostly ignore it. Every time a person uses a store’s loyalty rewards card or connects to an online service, his or her activities are governed by the equivalent of hundreds of pages of legalese. Most people pay no attention to these massive documents, often labeled “terms of service,” “user agreement” or “privacy policy.”

These are just part of a much wider societal problem of information overload. There is so much data stored – exabytes of it, as much stored as has ever been spoken by people in all of human history – that it’s humanly impossible to read and interpret everything. Often, we narrow down our pool of information by choosing particular topics or issues to pay attention to. But it’s important to actually know the meaning and contents of the legal documents that govern how our data is stored and who can see it.

As computer science researchers, we are working on ways artificial intelligence algorithms could digest these massive texts and extract their meaning, presenting it in terms regular people can understand.

Can computers understand text?

Computers store data as 0’s and 1’s – data that cannot be directly understood by humans. They interpret these data as instructions for displaying text, sound, images or videos that are meaningful to people. But can computers actually understand the language, not only presenting the words but also their meaning?

One way to find out is to ask computers to summarize their knowledge in ways that people can understand and find useful. It would be best if AI systems could process text quickly enough to help people make decisions as they are needed – for example, when you’re signing up for a new online service and are asked to agree with the site’s privacy policy.

What if a computerized assistant could digest all that legal jargon in a few seconds and highlight key points? Perhaps a user could even tell the automated assistant to pay particular attention to certain issues, like when an email address is shared, or whether search engines can index personal posts. Companies could use this capability, too, to analyze contracts or other lengthy documents.

To do this sort of work, we need to combine a range of AI technologies, including machine learning algorithms that take in large amounts of data and independently identify connections among them; knowledge representation techniques to express and interpret facts and rules about the world; speech recognition systems to convert spoken language to text; and human language comprehension programs that process the text and its context to determine what the user is telling the system to do.

Examining privacy policies

A modern internet-enabled life today more or less requires trusting for-profit companies with private information (like physical and email addresses, credit card numbers and bank account details) and personal data (photos and videos, email messages and location information).

These companies’ cloud-based systems typically keep multiple copies of users’ data as part of backup plans to prevent service outages. That means there are more potential targets – each data center must be securely protected both physically and electronically. Of course, internet companies recognize customers’ concerns and employ security teams to protect users’ data. But the specific and detailed legal obligations they undertake to do that are found in their impenetrable privacy policies. No regular human – and perhaps even no single attorney – can truly understand them.

In our study, we ask computers to summarize the terms and conditions regular users say they agree to when they click “Accept” or “Agree” buttons for online services. We downloaded the publicly available privacy policies of various internet companies, including Amazon AWS, Facebook, Google, HP, Oracle, PayPal, Salesforce, Snapchat, Twitter and WhatsApp.

Summarizing meaning

Our software examines the text and uses information extraction techniques to identify key information specifying the legal rights, obligations and prohibitions identified in the document. It also uses linguistic analysis to identify whether each rule applies to the service provider, the user or a third-party entity, such as advertisers and marketing companies. Then it presents that information in clear, direct, human-readable statements.
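To make the idea concrete, here is a minimal sketch of how a sentence from a policy might be sorted into a right, an obligation or a prohibition, and tied to the party it applies to. It is not our actual system: the keyword lists, the `classify_rule` function and the actor labels are all hypothetical, and simple pattern matching stands in for the information extraction and linguistic analysis described above.

```python
import re

# Hypothetical keyword cues (assumptions for illustration only).
# Prohibitions are checked first so "must not" is not read as "must".
PROHIBITION = re.compile(r"\b(must not|may not|shall not|cannot|will not)\b", re.I)
OBLIGATION = re.compile(r"\b(must|shall|will|is required to|are required to)\b", re.I)
RIGHT = re.compile(r"\b(may|can|is entitled to|are entitled to)\b", re.I)

# Hypothetical actor cues, matched against the start of the sentence.
ACTORS = [
    ("third party", re.compile(r"^(third part(y|ies)|advertisers?|partners?)\b", re.I)),
    ("provider", re.compile(r"^we\b", re.I)),
    ("user", re.compile(r"^you\b", re.I)),
]


def classify_rule(sentence):
    """Return (rule_type, actor) for a sentence, or None if no cue matches."""
    text = sentence.strip()
    if PROHIBITION.search(text):
        rule_type = "prohibition"
    elif OBLIGATION.search(text):
        rule_type = "obligation"
    elif RIGHT.search(text):
        rule_type = "right"
    else:
        return None  # not recognized as a rule by this simple heuristic

    actor = "unknown"
    for label, pattern in ACTORS:
        if pattern.search(text):
            actor = label
            break
    return rule_type, actor


if __name__ == "__main__":
    samples = [
        "You can choose not to provide certain information.",
        "We may also collect technical information to help us identify your device.",
        "You must not share your password.",
    ]
    for sentence in samples:
        print(classify_rule(sentence), "-", sentence)
```

Keyword matching like this misreads many real sentences, which is why a working system needs deeper linguistic analysis; the sketch only shows the shape of the output, a rule type paired with the party it binds.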

For example, our system identified one aspect of Amazon’s privacy policy as telling a user, “You can choose not to provide certain information, but then you might not be able to take advantage of many of our features.” Another aspect of that policy was described as “We may also collect technical information to help us identify your device for fraud prevention and diagnostic purposes.”

We also found, with the help of the summarizing system, that privacy policies often include rules for third parties – companies that aren’t the service provider or the user – that people might not even know are involved in data storage and retrieval.

The largest number of rules in privacy policies – 43 percent – apply to the company providing the service. Just under a quarter of the rules – 24 percent – create obligations for users and customers. The rest of the rules govern behavior by third-party services or corporate partners, or could not be categorized by our system.

The next time you click the “I Agree” button, be aware that you may be agreeing to share your data with other hidden companies who will be analyzing it.

We are continuing to improve our ability to succinctly and accurately summarize complex privacy policy documents in ways that people can understand and use to assess the risks associated with using a service.

Source: World Economic Forum

The Fascinating Facts Behind the Creation of Fictional Languages

In these 2+1 videos (the +1 will be a surprise at the end of this post) you can take a deeper look at how fictional/fantasy languages are created.

Like almost all studies and articles related to this topic, we must start with the grandfather of all these language-inventing methods, J.R.R. Tolkien. As explained in the following video, J.R.R. Tolkien was very consistent; being a linguist, he knew all the important features of human languages in general and took them into account. The video covers this thoroughly and engagingly, from vocabulary to grammar rules, not forgetting the geographical and diachronic aspects that affect every language in our world and in any fantasy world. It also analyses the Na’vi language (the one the main characters speak in Avatar), Star Trek’s Klingon and Dothraki, spoken in Game of Thrones.


But if we go further, we must see that if J.R.R. Tolkien was the grandfather of the idea of inventing fantasy languages, then his “children” and “grandchildren” must have developed his methods and invented new ones, bringing in new points of view, and so on. Accordingly, in the video below you can learn more about how many aspects you have to consider to invent a language. For example, you have to know the people using it, these people’s habits, their origins, even if they are only fictional! (The video is also about the communities of non-fictional people who use these languages enthusiastically.)


And the surprise we promised: J.R.R. Tolkien fluently reciting “Namárië,” a poem in his Elvish language Quenya:


Additionally, we should certainly not forget that all of these amazing people put in loads of effort to reach the level of being able to construct fictional languages. Marc Okrand, who invented Star Trek’s Klingon, has a PhD in linguistics from Berkeley, and Paul Frommer, the creator of Na’vi, is professor emeritus of clinical management communication at the University of Southern California.

Source: Noemi, author at Babelproject

A Harvard linguist reveals the most misused words in English

Some languages, like French, have an official body that decides how words can and cannot be used.

English, as a flexible, global language, has no such designated referee.

Therefore, there is no definitive answer to whether you’re using a word “correctly.”

It’s all a matter of taste and context. But there are opinions. And some count more than others.

Steven Pinker is probably as good an expert to ask as anyone. Helpfully, the renowned Harvard linguist and best-selling author recently wrote a book, titled “The Sense of Style,” that aims to help readers improve their use of the English language.

If you’re in the market for an update to old Strunk and White, it’s probably a good buy. But if you just want to spot-check that you’ve not been making embarrassing language mistakes for years, a monster list of 58 commonly misused phrases covered in the book, which recently appeared in the UK’s Independent newspaper, is probably a good place to start.

Here are some highlights:

  • Adverse means “detrimental.” It does not mean “averse” or “disinclined.” Correct: “There were adverse effects.” / “I’m not averse to doing that.”
  • Appraise means to “ascertain the value of.” It does not mean to “apprise” or to “inform.” Correct: “I appraised the jewels.” / “I apprised him of the situation.”
  • Beg the question means that a statement assumes the truth of what it should be proving; it does not mean to “raise the question.” Correct: “When I asked the dealer why I should pay more for the German car, he said I would be getting ‘German quality,’ but that just begs the question.”
  • Bemused means “bewildered.” It does not mean “amused.” Correct: “The unnecessarily complex plot left me bemused.” / “The silly comedy amused me.”
  • Cliché is a noun, not an adjective. The adjective is clichéd. Correct: “Shakespeare used a lot of clichés.” / “The plot was so clichéd.”
  • Data is a plural count noun not, standardly speaking, a mass noun. [Note: “Data is rarely used as a plural today, just as candelabra and agenda long ago ceased to be plurals,” Pinker writes. “But I still like it.”] Correct: “This datum supports the theory, but many of the other data refute it.”
  • Depreciate means to “decrease in value.” It does not mean to “deprecate” or to “disparage.” Correct: “My car has depreciated a lot over the years.” / “She deprecated his efforts.”
  • Disinterested means “unbiased.” It does not mean “uninterested.” Correct: “The dispute should be resolved by a disinterested judge.” / “Why are you so uninterested in my story?”
  • Enormity refers to extreme evil. It does not mean “enormousness.” [Note: It is acceptable to use it to mean a deplorable enormousness.] Correct: “The enormity of the terrorist bombing brought bystanders to tears.” / “The enormousness of the homework assignment required several hours of work.”
  • Hone means to “sharpen.” It does not mean to “home in on” or “to converge upon.” Correct: “She honed her writing skills.” / “We’re homing in on a solution.”
  • Hung means “suspended.” It does not mean “suspended from the neck until dead.” Correct: “I hung the picture on my wall.” / “The prisoner was hanged.”
  • Ironic means “uncannily incongruent.” It does not mean “inconvenient” or “unfortunate.” Correct: “It was ironic that I forgot my textbook on human memory.” / “It was unfortunate that I forgot my textbook the night before the quiz.”
  • Nonplussed means “stunned” or “bewildered.” It does not mean “bored” or “unimpressed.” Correct: “The market crash left the experts nonplussed.” / “His market pitch left the investors unimpressed.”
  • Parameter refers to a variable. It does not mean “boundary condition” or “limit.” Correct: “The forecast is based on parameters like inflation and interest rates.” / “We need to work within budgetary limits.”
  • Phenomena is a plural count noun — not a mass noun. Correct: “The phenomenon was intriguing, but it was only one of many phenomena gathered by the telescope.”
  • Shrunk, sprung, stunk, and sunk are past participles–not words in the past tense. Correct: “I’ve shrunk my shirt.” / “I shrank my shirt.”
  • Simplistic means “naively or overly simple.” It does not mean “simple” or “pleasingly simple.” Correct: “His simplistic answer suggested he wasn’t familiar with the material.” / “She liked the chair’s simple look.”
  • Verbal means “in linguistic form.” It does not mean “oral” or “spoken.” Correct: “Visual memories last longer than verbal ones.”
  • Effect means “influence”; to effect means “to put into effect”; to affect means either “to influence” or “to fake.” Correct: “They had a big effect on my style.” / “The law effected changes at the school.” / “They affected my style.” / “He affected an air of sophistication to impress her parents.”
  • Lie (intransitive: lies, lay, has lain) means to “recline”; lay (transitive: lays, laid, has laid) means to “set down”; lie (intransitive: lies, lied, has lied) means to “fib.” Correct: “He lies on the couch all day.” / “He lays a book upon the table.” / “He lies about what he does.”

It should be noted that while it’s always good to polish up your writing, one of the joys of language is that it isn’t fixed in time. It evolves. Nor is there a single “correct” style (in English, at least).

You’d neither connect nor impress if you chose your words like an Oxford don at a rap battle (though, actually, someone please make that YouTube video), and you’d be unlikely to get a job at an investment bank today speaking like Shakespeare. Why is this important? It’s easy to get too caught up in being perfectly “correct” and become a tedious language snob. Remember you probably want to come across as intelligent and thoughtful, not uptight and pedantic. So don’t get so worked up over the little things that you miss the larger point of good writing — to communicate clearly and engagingly with your chosen audience.

Source: Business Insider

Being fluent in 2 languages might literally change how you perceive time

Interesting article recently posted on www.mic.com

Being bilingual already has a long list of benefits. Research suggests that it boosts creativity and memory, strengthens multitasking and slows down the onset of dementia. But in case these benefits don’t already outweigh the monotony of memorizing grammar structures and vocabulary lists, here’s one more: Bilingualism seems to give us a more nuanced perception of time.

Scientists asked a group of Spanish-Swedish bilinguals to guess how much time passed after watching a container fill up with liquid or a line grow on a screen. When they asked the question using the word “duración” (Spanish for “duration”), participants adjusted their time estimates according to the volume in the container, but not the length of the line on their screen. When scientists used the word “tid” (Swedish for “time”), estimates were shaped by how long the line grew, but not by how much the containers were filled.

Here’s why that’s cool: Despite our frenzied morning commutes or our 15-minute lunch breaks, the way time works is, in some ways, up to our culture and imagination.

“Language can creep into our perception and basically make us experience time in a very language-specific way,” Panos Athanasopoulos, a co-author on the study and a professor at Lancaster University in the U.K., said.

Athanasopoulos compared it to Arrival, a 2016 film about a linguist (played by Amy Adams) who tries to decipher an alien language. In the movie, the way the aliens talk about time gives them the superpower of seeing into the future — so, as Adams begins to understand their language, she also sees what’s next.

“Unlike Hollywood, we’re not claiming that bilinguals can see into the future, but learning a language really does rewire the brain,” Athanasopoulos said in a Skype interview. “Mentally going back and forth between different languages and ways of understanding time is actually brain training. You are sending your brain to the gym.”

Humans understand time spatially or in terms of quantities, but how that happens is largely up to the languages we speak. When discussing how long we rest, for example, Swedish and English speakers tend to say they took a “short” or “long” break while Greek and Spanish speakers will say they took a “small” or “big” one. English speakers say that the future “is ahead of us” and the past “is behind us,” but Aymara speakers say the opposite; the past is in front of us because it’s something we’ve already seen and experienced, whereas the future is still a mystery and therefore is behind us.

“Basically, [bilingualism] makes you aware that there are different perspectives out there and it makes you more flexible in adopting those perspectives,” Athanasopoulos said. A second language literally gives the brain more neural pathways (or connections) to work with.

Not too long ago, it was somewhat unpopular to think that language influences the way we see the world. In the 1960s and 1970s, American linguist Noam Chomsky’s theory that there’s a “universal grammar” among the world’s roughly 7,000 languages rose to popularity. Some started to believe that most languages were the same, barring small differences, and therefore language could not change the way we think. But decades of research couldn’t confirm it.

“The universalism idea kind of took it a little bit too far,” Athanasopoulos said.

So, as science tweaks the way we understand linguistics and our brains, we realize that we’re capable of seeing the world in new ways. All you have to do is learn a second language.

Author: Kelly Kasulis

Targeting German speakers in Germany, Austria and Switzerland

There are some differences in Standard German as used in Germany, Austria and Switzerland, mostly in style and syntax, but other than cultural specifics, much of the vocabulary is the same. In translating general topics, there is normally no need to cater to individual variants. But marketing, social media or other direct types of content may need to be adapted for the different variants.

In general, terms related to finance, sports, culture or administration may need to be checked to confirm that they can be used in all three countries.

German is an official language in Germany, Austria, Switzerland, Belgium, Luxembourg and Liechtenstein.

2in1: Translation and layout in one project

Preparing the translation directly in the publication is becoming more and more popular, not only because translators can better see the context of the text they are working on, but also because it’s much more cost-efficient than preparing the translation first and then having the design team do the layout. Designers can also make errors when copy-pasting the translation into the artwork. If companies use a professional translation agency, the text will be reviewed by a professional linguist, which helps avoid errors in the final publication.

Eurideas Language Experts offers translation and layout services to its clients. We can translate directly in the InDesign format, so the final publication needs only minimal typesetting (to fix text boxes and titles in the case of ‘longer’ languages).

Our professional editorial team can also assist our clients with preparing the publication from scratch. If you have a report or study to publish, but you have a limited budget and outsourcing the layout to a design agency is not an option, we can help you.