Disruptive Technologies - When does Software write Software?

How Smart Technologies Affect Software Development – a Perspective

Everything gets “smart”. The trends of digital business – the Internet of Things, smart cities, intelligent traffic control or smart factories – all presuppose intelligent machines. The knowledge of a machine is its software. Software is the link between all participants of our increasingly interconnected and intelligent world – between the machines, and especially between humans and machines.

In households and in almost all industrial, service and production companies, software plays a central role in nearly every area of our daily lives. It optimizes processes, automates work and enhances productivity. “Smart” machines bring with them new business models as well as new socio-economic concepts.

Big Data and Deep Learning equip machines with more and more knowledge and thus enable them to make decisions independently. The machines analyze their environment and the behavior of their users, and with machine learning they get increasingly smarter – once the technologies are in place, even without much human intervention.

Machine-processable knowledge about the things of the real world has long been an enterprise asset across all sectors: on the one hand as a valued central knowledge resource, with the advantages of semantic search and anytime online availability; on the other hand as a marketable economic good for enterprises.

Knowledge is not only the basis for assistance and recommendation systems; once standardized, it also facilitates knowledge exchange between partners, clients and suppliers, as well as between industrial companies among each other.

An even wider horizon opens up with knowledge about processes and procedures, the “know-how”, because this allows machines not only to put their skills into action – that is, to actively change things or states – but also to automatically optimize their actions according to diverse criteria.

Once a machine comes with motivation, it finally acts not only as an assistant, but as a competent autonomous instance that can even develop its own strategies and team skills. Prominent examples are competing teams of soccer robots, or the self-driving cars presented live at the last Frankfurt International Motor Show (IAA) – built on high-performance CPUs and GPUs, but realized mainly in software.

But what impact does this Digital Revolution have on software developers? Not only with regard to the kind of software we will develop in the future, but in particular with regard to our software development itself?

To answer this question, let’s first have a look at the Semantic Web and at knowledge itself, and then at how knowledge is processed and maintained by machines.

Data, Information and Knowledge

For a common terminology, let’s briefly discuss the difference between data, information and actual knowledge (Fig. 1).

Figure 1: From Characters to Competence

Initially, data is only a structured compilation of letters, numbers and symbols – in a specific syntax, but without any reference. Data neither follows certain patterns nor relates to any context, so pure data carries no discernible meaning for a machine. Take the number 15: it can be an age, a distance or a weight. Or take the word “break”: even for us humans this string is ambiguous, both with regard to its type (noun or verb) and with regard to the affected objects.

Data obtains a meaning only when, in addition to its syntax, it is used in a certain context and serves a specific purpose. The following JSON data about the author of this article, for example, contains his age:

{
  "name": "Alexander Schulze",
  "country": "Germany",
  "age": 47,
  "birthday": "01/26/1968",
  "phone": "+49-2407-902486"
}

Listing 1: JSON record of the author

In the context of the field age, the number 47 becomes information. This allows our software to make decisions and to answer questions like who, what, where, when or how many – for example, how many authors in a list are older than 30.
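
A minimal sketch of such a query in plain JavaScript (the second record is invented for illustration):

// Counting the authors older than 30. The meaning of "age" – a time
// span measured in years – is hard-coded in this query; the program,
// not the data, carries that knowledge.
const authors = [
  { name: "Alexander Schulze", age: 47 },
  { name: "Jane Doe", age: 29 }           // hypothetical second author
];

const olderThan30 = authors.filter(author => author.age > 30).length;
console.log(olderThan30); // 1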

To run this query, however, the software needs to know that the field age denotes a time span measured in years. And it knows that only because it has been programmed explicitly in its source code. If the name of the field is changed in the database, the software no longer works, because it no longer knows the field’s meaning and cannot detect it either. More formally expressed: the software is the knowledge of the machine, represented in code.

Although code analyzers in the field of error detection and code optimization are already making promising progress (for example, the SQE project), the knowledge in the source code, let alone in the binary code, is very difficult to extract again. One could therefore understand the act of programming as a unidirectional knowledge transfer from man to machine.

A key problem is that the identifiers used are arbitrary, i.e. not subject to any formalism that would allow conclusions about their semantics. On the one hand, this of course means a certain freedom for us developers – at least within the scope of the syntax of the respective programming language. On the other hand, exactly this arbitrariness makes information from different sources difficult to compare and exchange, and especially hard to interpret and process.

So we are already in the midst of a new trend, namely the increased introduction of conventions, as in the eCl@ss project. Among other things, it deals with product classification, to facilitate the exchange of articles and services and their characteristics. For us web and mobile developers this is particularly relevant in the e-commerce sector.

What we have not even touched yet: in the example above with the author record, we only imply that the number 47, in relation to the age of a person, is information about time measured in years. For food, mayflies or atomic events, a number is more likely to be associated with units like days, hours or microseconds.

Let’s be honest with ourselves: how often do we find lines like the following in our code?

var timeout = 2000;  /* timeout in milliseconds */
var expiration = 14; /* expiration in days */
var age = 47;        /* age in years */

Listing 2: Value assignments without units

The unit is an essential part of numerical information. Without a unit, no qualified comparison is possible for a machine, and thus no autonomous decision.

The unit not only implies a numeric data type, but also provides a meaning. A property A with the value 5 and the unit kg clearly describes a mass – regardless of its identifier. Another property B with the value 2000 g is then comparable with A if both units kg and g are linked to the base unit mass – provided appropriate conversion factors are available.
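
A minimal sketch of such a unit-aware comparison in JavaScript, assuming a hypothetical conversion table that links each unit to the base unit:

// Conversion factors to the base unit of mass (kilograms) – a tiny,
// hand-maintained stand-in for what a knowledge base would provide.
const toKilograms = { kg: 1, g: 0.001, t: 1000 };

function compareMass(a, b) {
  // Both values are normalized to the base unit before comparing,
  // so the identifiers of the properties play no role at all.
  return a.value * toKilograms[a.unit] - b.value * toKilograms[b.unit];
}

const A = { value: 5, unit: "kg" };
const B = { value: 2000, unit: "g" };
console.log(compareMass(A, B) > 0); // true: 5 kg is more than 2000 g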

The same applies to identifiers: if, for example, the birthday field of our author record is linked with the Date class, then the system knows that the given string is a date. If StartDate is a subclass of Date, then the system knows that something starts on that date. If, finally, the results of the functions today() and now() are linked with the subclass CurrentDate, then the machine can determine the age of a person autonomously – without this ever having been explicitly implemented in the program code.
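
A minimal JavaScript sketch of this idea, with a hypothetical concept registry standing in for the ontology; the rule refers only to the concept Date, never to a concrete field name:

// The registry links field identifiers to concepts; "birthday" could be
// renamed at any time, as long as the link to the concept Date remains.
const registry = { birthday: "Date" };

function inferAgeInYears(record, field) {
  // Generic rule: "CurrentDate minus Date yields a duration in years".
  if (registry[field] !== "Date") throw new Error("field is not linked to Date");
  const millis = Date.now() - new Date(record[field]).getTime();
  return Math.floor(millis / (365.25 * 24 * 60 * 60 * 1000));
}

// Note: "01/26/1968" (MM/DD/YYYY) is parsed by most JavaScript engines,
// although this format is not part of the ECMAScript standard.
console.log(inferAgeInYears({ birthday: "01/26/1968" }, "birthday"));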

So knowledge is created through the interconnection of information. On the one hand, knowledge allows logical conclusions; on the other hand, it allows the recognition of contradictions and inconsistencies. In the future, the quality of our data and the intelligibility of our code will improve accordingly.

Smart Knowledge Bases

Now the question arises: how are we going to manage our increasingly comprehensive future knowledge in databases? Knowledge bases already differ conceptually and significantly from traditional databases – both from relational SQL table models, as in MySQL, and from NoSQL approaches, as in MongoDB with its documents and collections.

Knowledge bases follow the concept of graph databases. A graph consists of nodes, the actual elements of the database, and edges, the connections between these elements. In addition, there are properties, which describe both the elements themselves and their connections among each other: data properties describe the concrete values of elements, object properties the relationships between elements. All these terms are superordinately called resources.

The Semantic Web is a special derivative of a graph database, built of so-called ontologies. These manage their content in statements, which in turn are represented as triples of subject, predicate and object. An example:

Mozilla | isManufacturerOf | Firefox

Subject and predicate are always resources, i.e. either nodes or properties. For object properties the object is also a resource; for data properties it is a literal, i.e. a concrete value. Example:

Mozilla | was_founded | 2003

The ontologies of the Semantic Web have been standardized by the World Wide Web Consortium (W3C) and are represented in the Web Ontology Language (OWL), which in its version 2.0 is meanwhile known as OWL 2. A major strength of this language is its capability to classify objects. In ontologies, objects are called individuals, and each individual can be assigned to one or multiple classes. Statements, so-called axioms, could for instance be:

Firefox | is_a | Browser

Chrome | is_a | Browser

A subsequent axiom in the ontology, like:

Browser | supports | JavaScript

allows us to logically conclude the following two axioms:

Firefox | supports | JavaScript

Chrome | supports | JavaScript

This means that although these two axioms are not explicitly specified in the knowledge base, they can be queried as if they were.
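
How such derived statements come about can be illustrated with a naive JavaScript sketch over the triples from the example above – real reasoners are of course far more powerful:

// Explicit axioms, each represented as a subject/predicate/object triple.
const triples = [
  { s: "Firefox", p: "is_a", o: "Browser" },
  { s: "Chrome", p: "is_a", o: "Browser" },
  { s: "Browser", p: "supports", o: "JavaScript" }
];

// Naive rule: if X is_a C, and C supports Y, then X supports Y.
const inferred = [];
for (const t1 of triples) {
  if (t1.p !== "is_a") continue;
  for (const t2 of triples) {
    if (t2.s === t1.o && t2.p === "supports") {
      inferred.push({ s: t1.s, p: "supports", o: t2.o });
    }
  }
}

console.log(inferred);
// [ { s: "Firefox", p: "supports", o: "JavaScript" },
//   { s: "Chrome", p: "supports", o: "JavaScript" } ]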

The secret behind this is inference. So-called reasoners, the inference engines of an ontology, generate these additional statements at run-time by following logical rules. Indeed, depending on the size and complexity of an ontology, the reasoning process can take a while. However, reasoners of varying scope and power are available for the various purposes.

In this article I would like to give you an overview of new smart technologies and their impact on us as software developers. In a subsequent article we will go deeper into the features and the efficient use of ontologies in real applications. As a good start, I recommend studying the Pizza Ontology as well as the documentation of Protégé, one of the leading free ontology maintenance tools.

Once machines possess certain knowledge, future software development will become significantly simpler. Many things that we previously had to program tediously or define explicitly and independently will in future be inferred logically from fewer, but interlinked, statements.

Unlike traditional databases, knowledge bases are designed as hierarchically and modularly organized ontologies. They contain not only data and information, but also abstract concepts and classes as well as concrete facts and processes. Many ontologies are already freely available on the Internet – some with general, others with specialized content. It is to be expected that this portfolio will widen, and that with incipient standardization and centralization processes we will more and more use existing knowledge rather than develop it ourselves.

The newly launched technology project “Enapso” follows exactly this trend. Based on a generic concept, machine-processable knowledge about hardware, operating systems, platforms, programming languages and their syntax, libraries and their APIs, about architectures, data models, processes, algorithms and best practices, as well as about performance, resource utilization and safety criteria, is provided centrally in an Internet portal. Knowledge – both static knowledge and knowledge about processes and methodologies – can in future be exchanged and used by developers, either publicly or restricted to companies or projects.

Smart Requirements

So far we have discussed extensively the expected influence of smart technologies on software development itself. Let’s now take a step back.

After analyzing existing or desired business processes, a software life cycle typically begins with the specification of requirements – perhaps with a classic specification document, or with a product backlog in an agile software development environment. But whether waterfall model or Scrum, a significant drawback of all these specifications is that they are written in natural language.

This means that the requirements, too, can only be understood and processed by us humans. Similar to program code, the knowledge of the business analysts or the sales colleagues is unidirectionally transferred from man to machine – and as free text, from which it is difficult to extract again or even just to process by machine. Not to mention the naturally occurring misunderstandings and incompleteness.

Imagine your system could “understand” your requirements. For that, let’s first define a requirement as the description of the target state of an object that reacts to specific events in a certain way – that is, an object with certain modifiable properties and a certain behavior. These are the functional requirements. Furthermore, there are other aspects such as performance, resource utilization, scalability or security, i.e. the non-functional requirements.

Certainly, the list of all possible properties and actions of an object is long, but it is finite, corresponding to the scope of the platforms, programming languages and libraries used. Basically, when a database provides the knowledge about terms and their meanings, requirements can be defined based on these terms and thus be understood by a machine.
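
What such a term-based requirement could look like is sketched below in JavaScript; all term names are hypothetical and would have to resolve to concepts in a knowledge base:

// A machine-readable requirement built from terms instead of free text.
// Every term here – ShoppingCart, addItem, ResponseTime – is assumed to
// be defined in the knowledge base, so a machine can check the whole
// specification for consistency, completeness and feasibility.
const requirement = {
  object: "ShoppingCart",
  functional: [
    { onEvent: "addItem", behavior: "updateTotal" }
  ],
  nonFunctional: [
    { term: "ResponseTime", comparator: "<=", value: 200, unit: "ms" }
  ]
};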

Therefore, an essential aspect of smart requirements management is to manage requirements no longer as free text, but within ontologies with appropriate semantic support. The machine can then – just as with the knowledge itself – check software requirements for consistency and integrity. Potential contradictions or incompleteness in the specification will in future be uncovered by the machine – something that previously was difficult and could be achieved only in often extensive dialogues between us and our customers.

In addition, requirements can in future be compared with existing knowledge, and thereby even complex feasibility statements can be made in real time. On the one hand, we will be able to easily verify whether certain requirements can be implemented at all. Conversely, the machine will identify and report what knowledge needs to be supplemented in order to create a solution for a given problem.

Ontologies as a basis for requirements management will therefore enable assistance systems that help us obtain self-consistent, satisfiable and measurable specifications – an essential aspect of reliable project calculations. If the underlying knowledge base additionally considers complexity and dependencies as well as security and performance aspects, it will even be possible to analyze the necessary resources and to make more accurate predictions of time and costs.

In company-specific knowledge bases, custom object templates and defaults can be managed. Even processes can be modeled in ontologies – for example, through the integration of Business Process Model and Notation (BPMN). This know-how will make it possible in the future to rapidly generate usable results even from merely coarse requirement specifications.

If it is specified, for example, that a class Address consists of certain fields, and that the term Manager implies the capabilities to list, create, delete and update, then a first prototype can already be produced from the simple request AddressManager, without any further details. Rapid Application Development (RAD) cannot get any quicker. Of course, the option will always remain to refine the rough requirements within agile processes, and thereby to customize the solution and bring it to perfection.
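
A minimal JavaScript sketch of this idea, with hypothetical term definitions; the term Manager is expanded into the usual CRUD capabilities for the given class:

// Term definitions as a knowledge base would provide them.
const terms = {
  Address: { fields: ["street", "zip", "city", "country"] }
};

// "XManager" resolves to: CRUD capabilities applied to the class X.
function makeManager(className) {
  const fields = terms[className].fields;
  const records = [];
  return {
    list: () => records,
    create: (record) => {
      // Only fields defined for the class are accepted.
      for (const key of Object.keys(record)) {
        if (!fields.includes(key)) throw new Error("unknown field: " + key);
      }
      records.push(record);
      return record;
    },
    update: (index, record) => { records[index] = { ...records[index], ...record }; },
    remove: (index) => { records.splice(index, 1); }
  };
}

const addressManager = makeManager("Address"); // the request "AddressManager"
addressManager.create({ street: "Main St. 1", zip: "12345", city: "Berlin", country: "Germany" });
console.log(addressManager.list());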

Another advantage will be that the change requests which repeatedly approach us in real developer life can in future be reviewed in real time, already during their specification, and checked for their impact on all aspects of the final product. Increased time and costs can then be reported immediately and transparently to the customer – processes which previously were often difficult to predict, to measure and, in particular, hard to argue. Here smart technologies will support us massively in the future.

Overall, smart requirements management will help move strategic decisions – for example the choice of technologies, architectures and methodologies – more and more into the specification phase of software. Thereby coordination processes are accelerated, risks are recognized earlier, and software development as a whole becomes more transparent and more cost-effective for our customers.

Of course, this trend towards Smart Software Development will also have an impact on our everyday developer life. It is to be expected that more intelligent knowledge and requirements management will lead to a certain polarization of the developer community. While one group, for the reasons mentioned, dedicates itself to the setup, standardization and centralization of developer knowledge, the others will simply want to benefit from it and focus on the specification of business processes – ultimately on what the desired solution is, and less on how it can be achieved. But how can this work?

Smart Agents and Machine Learning

Knowledge bases will classify real-world objects, including software. They describe how these objects are linked among each other, their dependencies, their skills and their behavior. Processes describe algorithms and established best practices, and machine learning helps the machines to gain experience independently and to learn from it.

Experience gained from measurements against non-functional requirements – such as CPU and memory usage, execution speed, network load or energy consumption – as well as from simulations or the processing of user feedback, leads to a continuous horizontal and vertical expansion and improvement of the knowledge base and the processes.

Knowledge and experience will ultimately allow qualified decisions on the optimal technology and architecture to solve given tasks. If these tasks are formulated semantically, they can be processed by machines and connected to the knowledge. Already today, autonomous software agents are equipped with intelligent algorithms for knowledge- and requirements-based planning and strategy development. Mostly they take over responsibility for finding solutions cooperatively (Fig. 2).

Figure 2: Smart Solutions

Certainly, we are still far away from automatic software generation. In particular, the necessary knowledge base first needs to be created for this purpose. But code-parsing tools, Big Data analysis, Deep Learning technologies and Natural Language Processing (NLP) will continuously push this process forward.

Conclusion

Even though it might still appear to lie in visionary distance, already today we are working with extremely useful code assistance and optimization systems to improve the security, performance and stability of our software, thus increasing our competitiveness. In future, the gradual decoupling of definition and implementation processes will automate software development further, and thus exclude sources of errors, make our results more transparent and improve their quality. We should begin to prepare ourselves and to actively participate in this trend.
