Pedro Domingos

Pedro Domingos (born 1965) is a Professor Emeritus of computer science and engineering at the University of Washington. He is a researcher in machine learning known for Markov logic network enabling uncertain inference. == Education == Domingos received an undergraduate degree and Master of Science degree from Instituto Superior Técnico (IST). He moved to the University of California, Irvine, where he received a Master of Science degree followed by his PhD. == Research and career == After spending two years as an assistant professor at IST, he joined the University of Washington as an assistant professor of Computer Science and Engineering in 1999 and became a full professor in 2012. He started a machine learning research group at the hedge fund D. E. Shaw & Co. in 2018, but left in 2019. He co-founded the International Machine Learning Society. As of 2018, he was on the editorial board of Machine Learning journal. === Publications === Pedro Domingos, The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World, New York, Basic Books, 2015, ISBN 978-0-465-06570-7. Pedro Domingos, "Our Digital Doubles: AI will serve our species, not control it", Scientific American, vol. 319, no. 3 (September 2018), pp. 88–93. "AIs are like autistic savants and will remain so for the foreseeable future.... AIs lack common sense and can easily make errors that a human never would... They are also liable to take our instructions too literally, giving us precisely what we asked for instead of what we actually wanted." (p. 93.) Pedro Domingos, 2040: A Silicon Valley Satire, BookBaby, 2024, ISBN 979-8-350-96334-2. === Awards and honors === 2014: ACM SIGKDD Innovation Award. for his foundational research in data stream analysis, cost-sensitive classification, adversarial learning, and Markov logic networks, as well as applications in viral marketing and information integration. 2010: Elected an Association for the Advancement of Artificial Intelligence (AAAI) Fellow. For significant contributions to the field of machine learning and to the unification of first-order logic and probability. 2003: Sloan Fellowship 1992–1997: Fulbright Scholarship

Rademacher complexity

In computational learning theory (machine learning and theory of computation), Rademacher complexity, named after Hans Rademacher, measures richness of a class of sets with respect to a probability distribution. The concept can also be extended to real valued functions. == Definitions == === Rademacher complexity of a set === Given a set A ⊆ R m {\displaystyle A\subseteq \mathbb {R} ^{m}} , the Rademacher complexity of A is defined as follows: Rad ⁡ ( A ) := 1 m E σ [ sup a ∈ A ∑ i = 1 m σ i a i ] {\displaystyle \operatorname {Rad} (A):={\frac {1}{m}}\mathbb {E} _{\sigma }\left[\sup _{a\in A}\sum _{i=1}^{m}\sigma _{i}a_{i}\right]} where σ 1 , σ 2 , … , σ m {\displaystyle \sigma _{1},\sigma _{2},\dots ,\sigma _{m}} are independent random variables drawn from the Rademacher distribution i.e. Pr ( σ i = + 1 ) = Pr ( σ i = − 1 ) = 1 / 2 {\displaystyle \Pr(\sigma _{i}=+1)=\Pr(\sigma _{i}=-1)=1/2} for i ∈ { 1 , 2 , … , m } {\displaystyle i\in \{1,2,\dots ,m\}} , and a = ( a 1 , … , a m ) ∈ A {\displaystyle a=(a_{1},\ldots ,a_{m})\in A} . Some authors take the absolute value of the sum before taking the supremum, but if A {\displaystyle A} is symmetric this makes no difference. === Rademacher complexity of a function class === Let S = { z 1 , z 2 , … , z m } ⊆ Z {\displaystyle S=\{z_{1},z_{2},\dots ,z_{m}\}\subseteq Z} be a sample of points and consider a function class F {\displaystyle {\mathcal {F}}} of real-valued functions over Z {\displaystyle Z} . Then, the empirical Rademacher complexity of F {\displaystyle {\mathcal {F}}} given S {\displaystyle S} is defined as: Rad S ⁡ ( F ) = 1 m E σ [ sup f ∈ F | ∑ i = 1 m σ i f ( z i ) | ] {\displaystyle \operatorname {Rad} _{S}({\mathcal {F}})={\frac {1}{m}}\mathbb {E} _{\sigma }\left[\sup _{f\in {\mathcal {F}}}\left|\sum _{i=1}^{m}\sigma _{i}f(z_{i})\right|\right]} This can also be written using the previous definition: Rad S ⁡ ( F ) = Rad ⁡ ( F ∘ S ) {\displaystyle \operatorname {Rad} _{S}({\mathcal {F}})=\operatorname {Rad} ({\mathcal {F}}\circ S)} where F ∘ S {\displaystyle {\mathcal {F}}\circ S} denotes function composition, i.e.: F ∘ S := { ( f ( z 1 ) , … , f ( z m ) ) ∣ f ∈ F } {\displaystyle {\mathcal {F}}\circ S:=\{(f(z_{1}),\ldots ,f(z_{m}))\mid f\in {\mathcal {F}}\}} The worst case empirical Rademacher complexity is Rad ¯ m ( F ) = sup S = { z 1 , … , z m } Rad S ⁡ ( F ) {\displaystyle {\overline {\operatorname {Rad} }}_{m}({\mathcal {F}})=\sup _{S=\{z_{1},\dots ,z_{m}\}}\operatorname {Rad} _{S}({\mathcal {F}})} Let P {\displaystyle P} be a probability distribution over Z {\displaystyle Z} . The Rademacher complexity of the function class F {\displaystyle {\mathcal {F}}} with respect to P {\displaystyle P} for sample size m {\displaystyle m} is: Rad P , m ⁡ ( F ) := E S ∼ P m [ Rad S ⁡ ( F ) ] {\displaystyle \operatorname {Rad} _{P,m}({\mathcal {F}}):=\mathbb {E} _{S\sim P^{m}}\left[\operatorname {Rad} _{S}({\mathcal {F}})\right]} where the above expectation is taken over an identically independently distributed (i.i.d.) sample S = ( z 1 , z 2 , … , z m ) {\displaystyle S=(z_{1},z_{2},\dots ,z_{m})} generated according to P {\displaystyle P} . == Intuition == The Rademacher complexity is typically applied on a function class of models that are used for classification, with the goal of measuring their ability to classify points drawn from a probability space under arbitrary labellings. When the function class is rich enough, it contains functions that can appropriately adapt for each arrangement of labels, simulated by the random draw of σ i {\displaystyle \sigma _{i}} under the expectation, so that this quantity in the sum is maximized. The Rademacher complexity of a set A {\displaystyle A} can be rewritten as Rad ⁡ ( A ) := 1 m E σ [ sup a ∈ A ∑ i = 1 m σ i a i ] = 1 m 2 m ∑ σ ∈ { − 1 / m , + 1 / m } m [ sup a ∈ A ⟨ σ , a ⟩ ] . {\displaystyle \operatorname {Rad} (A):={\frac {1}{m}}\mathbb {E} _{\sigma }\left[\sup _{a\in A}\sum _{i=1}^{m}\sigma _{i}a_{i}\right]={\frac {1}{{\sqrt {m}}2^{m}}}\sum _{\sigma \in \{-1/{\sqrt {m}},+1/{\sqrt {m}}\}^{m}}\left[\sup _{a\in A}\langle \sigma ,a\rangle \right].} Each term in the summation is the farthest distance of the set A {\displaystyle A} from the origin, along a unit-length direction σ {\displaystyle \sigma } . The directions are along the vertices of a hypercube. Thus, we can also write it as Rad ⁡ ( A ) = 1 2 m 1 2 m − 1 ∑ σ ∈ { − 1 / m , + 1 / m } m / { − 1 , + 1 } [ sup a ∈ A ⟨ σ , a ⟩ − inf a ∈ A ⟨ σ , a ⟩ ] {\displaystyle \operatorname {Rad} (A)={\frac {1}{2{\sqrt {m}}}}{\frac {1}{2^{m-1}}}\sum _{\sigma \in \{-1/{\sqrt {m}},+1/{\sqrt {m}}\}^{m}/\{-1,+1\}}\left[\sup _{a\in A}\langle \sigma ,a\rangle -\inf _{a\in A}\langle \sigma ,a\rangle \right]} Here, the set { − 1 / m , + 1 / m } m / { − 1 , + 1 } {\displaystyle \{-1/{\sqrt {m}},+1/{\sqrt {m}}\}^{m}/\{-1,+1\}} denotes half of the vertices of a hypercube, selected so that each diagonal has exactly one vertex selected. In words, this states that 2 m Rad ⁡ ( A ) {\displaystyle 2{\sqrt {m}}\operatorname {Rad} (A)} is precisely the average width of the set A {\displaystyle A} along all diagonal directions of a hypercube. == Examples == A singleton set has 0 width in any direction, so it has Rademacher complexity 0. The set A = { ( 1 , 1 ) , ( 1 , 2 ) } ⊆ R 2 {\displaystyle A=\{(1,1),(1,2)\}\subseteq \mathbb {R} ^{2}} has average width 1 / 2 {\displaystyle 1/{\sqrt {2}}} along the two diagonal directions of the square, so it has Rademacher complexity 1 / 4 {\displaystyle 1/4} . The unit cube [ 0 , 1 ] m {\displaystyle [0,1]^{m}} has constant width m {\displaystyle {\sqrt {m}}} along the diagonal directions, so it has Rademacher complexity 1 / 2 {\displaystyle 1/2} . Similarly, the unit cross-polytope { x ∈ R m : ‖ x ‖ 1 ≤ 1 } {\displaystyle \{x\in \mathbb {R} ^{m}:\|x\|_{1}\leq 1\}} has constant width 2 / m {\displaystyle 2/{\sqrt {m}}} along the diagonal directions, so it has Rademacher complexity 1 / m {\displaystyle 1/m} . == Using the Rademacher complexity == The Rademacher complexity can be used to derive data-dependent upper-bounds on the learnability of function classes. Intuitively, a function-class with smaller Rademacher complexity is easier to learn. === Bounding the representativeness === In machine learning, it is desired to have a training set that represents the true distribution of some sample data S {\displaystyle S} . This can be quantified using the notion of representativeness. Denote by P {\displaystyle P} the probability distribution from which the samples are drawn. Denote by H {\displaystyle H} the set of hypotheses (potential classifiers) and denote by F {\displaystyle {\mathcal {F}}} the corresponding set of error functions, i.e., for every hypothesis h ∈ H {\displaystyle h\in H} , there is a function f h ∈ F {\displaystyle f_{h}\in F} , that maps each training sample (features,label) to the error of the classifier h {\displaystyle h} (note in this case hypothesis and classifier are used interchangeably). For example, in the case that h {\displaystyle h} represents a binary classifier, the error function is a 0–1 loss function, i.e. the error function f h {\displaystyle f_{h}} returns 0 if h {\displaystyle h} correctly classifies a sample and 1 else. We omit the index and write f {\displaystyle f} instead of f h {\displaystyle f_{h}} when the underlying hypothesis is irrelevant. Define: L P ( f ) := E z ∼ P [ f ( z ) ] {\displaystyle L_{P}(f):=\mathbb {E} _{z\sim P}[f(z)]} – the expected error of some error function f ∈ F {\displaystyle f\in {\mathcal {F}}} on the real distribution P {\displaystyle P} ; L S ( f ) := 1 m ∑ i = 1 m f ( z i ) {\displaystyle L_{S}(f):={1 \over m}\sum _{i=1}^{m}f(z_{i})} – the estimated error of some error function f ∈ F {\displaystyle f\in {\mathcal {F}}} on the sample S {\displaystyle S} . The representativeness of the sample S {\displaystyle S} , with respect to P {\displaystyle P} and F {\displaystyle {\mathcal {F}}} , is defined as: Rep P ⁡ ( F , S ) := sup f ∈ F ( L P ( f ) − L S ( f ) ) {\displaystyle \operatorname {Rep} _{P}({\mathcal {F}},S):=\sup _{f\in F}(L_{P}(f)-L_{S}(f))} Smaller representativeness is better, since it provides a way to avoid overfitting: it means that the true error of a classifier is not much higher than its estimated error, and so selecting a classifier that has low estimated error will ensure that the true error is also low. Note however that the concept of representativeness is relative and hence can not be compared across distinct samples. The expected representativeness of a sample can be bounded above by the Rademacher complexity of the function class: If F {\displaystyle {\mathcal {F}}} is a set of functions with range within [ 0 , 1 ] {\displaystyle [0,1]} , then Rad P , m ⁡ ( F ) − ln ⁡ 2 2 m ≤ E S ∼ P m [ Rep P ⁡ ( F , S ) ] ≤ 2 Rad P , m ⁡ ( F ) {\displaystyle \operatorname {Rad} _{P,m}({\mathcal {F}})-{\sqrt {\frac {\ln 2}{2m}}}\leq \mathbb {E} _{S\sim P^{m}}[\operatorname {Rep} _{P}({\

Content strategy

Content strategy guides the planning, development, and management of content. It is a recognized field in user experience design, and it also draws from adjacent disciplines such as information architecture, content management, business analysis, digital marketing, and technical communication. == Definitions == Content strategy has been described as planning for "the creation, publication, and governance of useful, usable content." It has also been called "a repeatable system that defines the entire editorial content development process for a website development project." In a 2007 article titled "Content Strategy: The Philosophy of Data," Rachel Lovinger describes the goal of content strategy as using "words and data to create unambiguous content that supports meaningful, interactive experiences." Here, she also provided the analogy that "content strategy is to copywriting as information architecture is to design." She encourages content strategists and collaborators to engage in early discussions about content meaning, models, and tools, to make sure strategy is integrated from the start rather than as an afterthought. The Content Strategy Alliance combines Kevin Nichols' definition with Kristina Halvorson's and defines content strategy as "getting the right content to the right user at the right time through strategic planning of content creation, delivery, and governance." == Practitioners == Content strategists are often familiar with a wide range of approaches, techniques, and tools. The perspectives that content strategists bring also depend heavily on their professional training and education. For instance, some specialize in "front-end strategy," which includes developing personas, journey mapping the user experience, aligning business strategy and user needs, developing a brand strategy, exploring different channels, and creating style guidelines and search engine optimization (SEO) guidelines. Others specialize in "back-end strategy," which includes creating content models, planning taxonomies and metadata, structuring content management systems, and building systems to support content reuse. Both roles involve addressing workflow and governance issues. Many organizations and individuals tend to confuse content strategists with editors. However, content strategy is "about more than just the written word," according to Washington State University associate professor Brett Atwood. For example, Atwood indicates that a practitioner needs to also "consider how content might be re-distributed and/or re-purposed in other channels of delivery." It has also been proposed that the content strategist performs the role of a curator. Just as a museum curator sifts through a collection of content and identifies key pieces that can be juxtaposed against each other to create meaning and spur excitement, a content strategist "must approach a business’s content as a medium that needs to be strategically selected and placed to engage the audience, convey a message, and inspire action."

Web performance

Web performance refers to the speed in which web pages are downloaded and displayed on the user's web browser. Web performance optimization (WPO), or website optimization is the field of knowledge about increasing web performance. Faster website download speeds have been shown to increase visitor retention and loyalty and user satisfaction, especially for users with slow internet connections and those on mobile devices. Web performance also leads to less data travelling across the web, which in turn lowers a website's power consumption and environmental impact. Some aspects which can affect the speed of page load include browser/server cache, image optimization, and encryption (for example SSL), which can affect the time it takes for pages to render. The performance of the web page can be improved through techniques such as multi-layered cache, light weight design of presentation layer components and asynchronous communication with server side components. == History == In the first decade or so of the web's existence, web performance improvement was focused mainly on optimizing website code and pushing hardware limitations. According to the 2002 book Web Performance Tuning by Patrick Killelea, some of the early techniques used were to use simple servlets or CGI, increase server memory, and look for packet loss and retransmission. Although these principles now comprise much of the optimized foundation of internet applications, they differ from current optimization theory in that there was much less of an attempt to improve the browser display speed. Steve Souders coined the term "web performance optimization" in 2004. At that time Souders made several predictions regarding the impact that WPO as an "emerging industry" would bring to the web, such as websites being fast by default, consolidation, web standards for performance, environmental impacts of optimization, and speed as a differentiator. One major point that Souders made in 2007 is that at least 80% of the time that it takes to download and view a website is controlled by the front-end structure. This lag time can be decreased through awareness of typical browser behavior, as well as of how HTTP works. == Optimization techniques == Web performance optimization improves user experience (UX) when visiting a website and therefore is highly desired by web designers and web developers. They employ several techniques that streamline web optimization tasks to decrease web page load times. This process is known as front end optimization (FEO) or content optimization. FEO concentrates on reducing file sizes and "minimizing the number of requests needed for a given page to load." In addition to the techniques listed below, the use of a content delivery network—a group of proxy servers spread across various locations around the globe—is an efficient delivery system that chooses a server for a specific user based on network proximity. Typically the server with the quickest response time is selected. The following techniques are commonly used web optimization tasks and are widely used by web developers: Web browsers open separate Transmission Control Protocol (TCP) connections for each Hypertext Transfer Protocol (HTTP) request submitted when downloading a web page. These requests total the number of page elements required for download. However, a browser is limited to opening only a certain number of simultaneous connections to a single host. To prevent bottlenecks, the number of individual page elements are reduced using resource consolidation whereby smaller files (such as images) are bundled together into one file. This reduces HTTP requests and the number of "round trips" required to load a web page. Web pages are constructed from code files such JavaScript and Hypertext Markup Language (HTML). As web pages grow in complexity, so do their code files and subsequently their load times. File compression can reduce code files by about 40 percent, thereby improving site responsiveness. Web Caching Optimization reduces server load, bandwidth usage and latency. CDNs use dedicated web caching software to store copies of documents passing through their system. Many website platforms, such as SiteGround, IONOS, Wix, and Hostinger, rely on global CDNs and caching technologies to deliver faster page loads across different geographical regions. Subsequent requests from the cache may be fulfilled should certain conditions apply. Web caches are located on either the client side (forward position) or web-server side (reverse position) of a CDN. Web browsers are also able to store content for re-use through the HTTP cache or web cache. Requests web browsers make are typically routed to the HTTP cache to validate if a cached response may be used to fulfill a request. If such a match is made, the response is fulfilled from the cache. This can be helpful for reducing network latency and costs associated with data-transfer. The HTTP cache is configured using request and response headers. Code minification distinguishes discrepancies between codes written by web developers and how network elements interpret code. Minification removes comments and extra spaces as well as crunch variable names in order to minimize code, decreasing files sizes by as much as 60%. In addition to caching and compression, lossy compression techniques (similar to those used with audio files) remove non-essential header information and lower original image quality on many high resolution images. These changes, such as pixel complexity or color gradations, are transparent to the end-user and do not noticeably affect perception of the image. Another technique is the replacement of raster graphics with resolution-independent vector graphics. Vector substitution is best suited for simple geometric images. Lazy loading of images and video reduces initial page load time, initial page weight, and system resource usage, all of which have positive impacts on website performance. It is used to defer initialization of an object right until the point at which it is needed. The browser loads the images in a page or post when they are needed such as when the user scrolls down the page and not all images at once, which is the default behavior, and naturally, takes more time. == HTTP/1.x and HTTP/2 == Since web browsers use multiple TCP connections for parallel user requests, congestion and browser monopolization of network resources may occur. Because HTTP/1 requests come with associated overhead, web performance is impacted by limited bandwidth and increased usage. Compared to HTTP/1, HTTP/2 is binary instead of textual is fully multiplexed instead of ordered and blocked can therefore use one connection for parallelism uses header compression to reduce overhead allows servers to "push" responses proactively into client caches Instead of a website's hosting server, CDNs are used in tandem with HTTP/2 in order to better serve the end-user with web resources such as images, JavaScript files and Cascading Style Sheet (CSS) files since a CDN's location is usually in closer proximity to the end-user. == Metrics == In recent years, several metrics have been introduced that help developers measure various aspects of the performance of their websites. In 2019, Google introduced metrics such as Time to First Byte (TTFB), First Contentful Paint (FCP), First Paint (FP), First Input Delay (FID), Cumulative Layout Shift (CLS) and Largest Contentful Paint (LCP) allow for website owner to gain insights into issues that might hurt the performance of their websites making it seem sluggish or slow to the user. Other metrics including Request Count (number of requests required to load a page), DOMContentLoaded (time when HTML document is completely loaded and parsed excluding CSS style sheets, images, etc.), Above The Fold Time (content that is visible without scrolling), Round Trip Time, number of Render Blocking Resources (such as scripts, stylesheets), Onload Time, Connection Time, Total Page Size help provide an accurate picture of latencies and slowdowns occurring at the networking level which might slow down a site. Modules to measure metrics such as TTFB, FCP, LCP, FP etc are provided with major frontend JavaScript libraries such as React, NuxtJS and Vue. Google publishes a library, the core-web-vitals library that allows for easy measurement of these metrics in frontend applications. In addition to this, Google also provides the Lighthouse, a Chrome dev-tools component and PageSpeed Insight a site that allows developers to measure and compare the performance of their website with Google's recommended minimums and maximums. In addition to this, tools such as the Network Monitor by Mozilla Firefox help provide insight into network-level slowdowns that might occur during transmission of data.

SD-WAN

A Software-Defined Wide Area Network (SD-WAN) is a wide area network that uses software-defined networking technology, such as communicating over the Internet using overlay tunnels which are encrypted when destined for internal organization locations. If standard tunnel setup and configuration messages are supported by all of the network hardware vendors, SD-WAN simplifies the management and operation of a WAN by decoupling the networking hardware from its control mechanism. This concept is similar to how software-defined networking implements virtualization technology to improve data center management and operation. In practice, proprietary protocols are used to set up and manage an SD-WAN, meaning there is no decoupling of the hardware and its control mechanism. A key application of SD-WAN is to allow companies to build higher-performance WANs using lower-cost and commercially available Internet access, enabling businesses to partially or wholly replace more expensive private WANs connection technologies such as MPLS. When SD-WAN traffic is carried over the Internet, there are no end-to-end performance guarantees. Carrier MPLS VPN WAN services are not carried as Internet traffic, but rather over carefully controlled carrier capacity, and do come with an end-to-end performance guarantee. == History == WANs were very important for the development of networking in general and for a long time one of the most important applications of networks both for military and enterprise applications. The ability to communicate data over long distances was one of the main driving factors for the development of data communications, as it made it possible to overcome the distance limitations, as well as shortening the time necessary to exchange messages with other parties. Legacy WANs allowed communication over circuits connecting two or more endpoints. Earlier networking supported point-to-point communication over a slow speed circuit, usually between two fixed locations. As networking progressed, WAN circuits became faster and more flexible. Innovations like circuit and packet switching (in the form of X.25, ATM and later Internet Protocol or Multiprotocol Label Switching) allowed communication to become more dynamic, supporting ever-growing networks. The need for strict control, security and quality of service (QOS) meant that multinational corporations were very conservative in leasing and operating their WANs. National regulations restricted the companies that could provide local service in each country, and complex arrangements were necessary to establish truly global networks. All that changed with the growth of the Internet, which permitted entities around the world to connect to each other. However, over the first years, the uncontrolled nature of the Internet was not considered adequate or safe for private corporate use. Independent of safety concerns, connectivity to the Internet became a necessity to the point where every branch required Internet access. At first, due to safety concerns, private communications were still done via WAN, and communication with other entities (including customers and partners) moved to the Internet. As the Internet grew in reach and maturity, companies started to evaluate how to leverage it for private corporate communications. During the early 2000s, application delivery over the WAN became an important topic of research and commercial innovation. Over the next decade, increasing computing power made it possible to create software-based appliances that were able to analyze traffic and make informed decisions without delays, making it possible to create large-scale overlay networks over the public Internet that could replicate all the functionality of legacy WANs, at a fraction of the cost. SD-WAN combines several networking aspects to create full-fledged private networks, with the ability to dynamically share network bandwidth across the connection points. Additional enhancements include central controllers, zero-touch provisioning, integrated analytics and on-demand circuit provisioning, with some network intelligence based in the cloud, allowing centralized policy management and security. Networking publications started using the term SD-WAN to describe this new networking trend as early as 2014. With the rapid shift to remote work as a result of lockdowns and stay at home orders during the COVID-19 pandemic, SD-WAN grew in popularity as a way of connecting remote workers. == Overview == WANs allow companies to extend their computer networks over large distances, connecting remote branch offices to data centers and to each other, and delivering applications and services required to perform business functions. Due to the physical constraints imposed by the propagation time over large distances, and the need to integrate multiple service providers to cover global geographies (often crossing nation boundaries), WANs face important operational challenges, including network congestion, packet delay variation, packet loss, and even service outages. Modern applications such as VoIP calling, videoconferencing, streaming media, and virtualized applications and desktops require low latency. Bandwidth requirements are also increasing, especially for applications featuring high-definition video. It can be expensive and difficult to expand WAN capability, with corresponding difficulties related to network management and troubleshooting. SD-WAN products are designed to address these network problems. By enhancing or even replacing traditional branch routers with virtualization appliances that can control application-level policies and offer a network overlay, less expensive consumer-grade Internet links can act more like a dedicated circuit. This simplifies the setup process for branch personnel. SD-WAN products can be physical appliances or software based only. === Components === The MEF Forum has defined an SD-WAN architecture consisting of an SD-WAN edge, SD-WAN gateway, SD-WAN controller and SD-WAN orchestrator. ==== SD-WAN edge ==== The SD-WAN edge is a physical or virtual network function that is placed at an organization's branch/regional/central office site, data center, and in public or private cloud platforms. MEF Forum has published the first SD-WAN service standard, MEF 70 which defines the fundamental characteristics of an SD-WAN service plus service requirements and attributes. ==== SD-WAN gateway ==== SD-WAN gateways provide access to the SD-WAN service in order to shorten the distance to cloud-based services or the user, and reduce service interruptions. A distributed network of gateways may be included in an SD-WAN service by the vendor or setup and maintained by the organization using the service. By sitting outside the headquarters in the cloud, the gateway also reduces headquarters traffic. ==== SD-WAN orchestrator ==== The SD-WAN orchestrator is a cloud hosted or on-premises web management tool that allows configuration, provisioning and other functions when operating an SD-WAN. It simplifies application traffic management by allowing central implementation of an organization's business policies. ==== SD-WAN controller ==== The SD-WAN controller functionality, which can be placed in the orchestrator or in an SD-WAN gateway, is used to make forwarding decisions for application flows. Application flows are IP packets that have been classified to determine their user application or grouping of applications to which they are associated. The grouping of application flows based on a common type, e.g., conferencing applications, is referred to as an Application Flow Group in MEF 70. Per MEF 70, the SD-WAN Edge classifies incoming IP packets at the SD-WAN UNI (SD-WAN user network interface), determines, via OSI Layer 2 through Layer 7 classification, which application flow the IP packets belong to, and then applies the policies to block the application flow or allow the application flows to be forwarded based on the availability of a route to the destination SD-WAN UNI on a remote SD-WAN Edge. This helps ensure that application performance meets service level agreements (SLAs). == Required characteristics == The Gartner research firm has defined an SD-WAN as having four required characteristics: The ability to support multiple connection types, such as MPLS, last mile fiber optic network or through high speed cellular networks e.g. 4G LTE and 5G wireless technologies The ability to do dynamic path selection, for load sharing and resiliency purposes A simple interface that is easy to configure and manage The ability to support VPNs, and third party services such as WAN optimization controllers, firewalls and web gateways == Features == Features of SD-WANs include resilience, quality of service (QoS), security, and performance, with flexible deployment options; simplified administration and troubleshooting; and online traffic engineering. === Resilience === A resilient SD-WAN reduces network downtime. To

Digital image processing

Digital image processing is the use of a digital computer to process digital images through an algorithm. As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of noise and distortion during processing. Since images are defined over two dimensions (perhaps more), digital image processing may be modeled in the form of multidimensional systems. The generation and development of digital image processing are mainly affected by three factors: first, the development of computers; second, the development of mathematics (especially the creation and improvement of discrete mathematics theory); and third, the demand for a wide range of applications in environment, agriculture, military, industry and medical science has increased. == History == Many of the techniques of digital image processing, or digital picture processing as it often was called, were developed in the 1960s, at Bell Laboratories, the Jet Propulsion Laboratory, Massachusetts Institute of Technology, University of Maryland, and a few other research facilities, with application to satellite imagery, wire-photo standards conversion, medical imaging, videophone, character recognition, and photograph enhancement. The purpose of early image processing was to improve the quality of the image. In image processing, the input is a low-quality image, and the output is an image with improved quality. Common image processing includes image enhancement, restoration, encoding, and compression. The first successful application was the American Jet Propulsion Laboratory (JPL). They used image processing techniques such as geometric correction, gradation transformation, noise removal, etc. on the thousands of lunar photos sent back by the Space Detector Ranger 7 in 1964, taking into account the position of the Sun and the environment of the Moon. The impact of the successful mapping of the Moon's surface map by the computer has been a success. Later, more complex image processing was performed on the nearly 100,000 photos sent back by the spacecraft, so that the topographic map, color map and panoramic mosaic of the Moon were obtained, which achieved extraordinary results and laid a solid foundation for human landing on the Moon. The cost of processing was fairly high, however, with the computing equipment of that era. That changed in the 1970s, when digital image processing proliferated as cheaper computers and dedicated hardware became available. This led to images being processed in real-time, for some dedicated problems such as television standards conversion. As general-purpose computers became faster, they started to take over the role of dedicated hardware for all but the most specialized and computer-intensive operations. With the fast computers and signal processors available in the 2000s, digital image processing has become the most common form of image processing, and is generally used because it is not only the most versatile method, but also the cheapest. === Image sensors === The basis for modern image sensors is metal–oxide–semiconductor (MOS) technology, invented at Bell Labs between 1955 and 1960, This led to the development of digital semiconductor image sensors, including the charge-coupled device (CCD) and later the CMOS sensor. The charge-coupled device was invented by Willard S. Boyle and George E. Smith at Bell Labs in 1969. While researching MOS technology, they realized that an electric charge was the analogy of the magnetic bubble and that it could be stored on a tiny MOS capacitor. As it was fairly straightforward to fabricate a series of MOS capacitors in a row, they connected a suitable voltage to them so that the charge could be stepped along from one to the next. The CCD is a semiconductor circuit that was later used in the first digital video cameras for television broadcasting. The NMOS active-pixel sensor (APS) was invented by Olympus in Japan during the mid-1980s. This was enabled by advances in MOS semiconductor device fabrication, with MOSFET scaling reaching smaller micron and then sub-micron levels. The NMOS APS was fabricated by Tsutomu Nakamura's team at Olympus in 1985. The CMOS active-pixel sensor (CMOS sensor) was later developed by Eric Fossum's team at the NASA Jet Propulsion Laboratory in 1993. By 2007, sales of CMOS sensors had surpassed CCD sensors. MOS image sensors are widely used in optical mouse technology. The first optical mouse, invented by Richard F. Lyon at Xerox in 1980, used a 5 μm NMOS integrated circuit sensor chip. Since the first commercial optical mouse, the IntelliMouse introduced in 1999, most optical mouse devices use CMOS sensors. === Image compression === An important development in digital image compression technology was the discrete cosine transform (DCT), a lossy compression technique first proposed by Nasir Ahmed in 1972. DCT compression became the basis for JPEG, which was introduced by the Joint Photographic Experts Group in 1992. JPEG compresses images down to much smaller file sizes, and has become the most widely used image file format on the Internet. Its highly efficient DCT compression algorithm was largely responsible for the wide proliferation of digital images and digital photos, with several billion JPEG images produced every day as of 2015. Medical imaging techniques produce very large amounts of data, especially from CT, MRI and PET modalities. As a result, storage and communications of electronic image data are prohibitive without the use of compression. JPEG 2000 image compression is used by the DICOM standard for storage and transmission of medical images. The cost and feasibility of accessing large image data sets over low or various bandwidths are further addressed by use of another DICOM standard, called JPIP, to enable efficient streaming of the JPEG 2000 compressed image data. === Digital signal processor (DSP) === Electronic signal processing was revolutionized by the wide adoption of MOS technology in the 1970s. MOS integrated circuit technology was the basis for the first single-chip microprocessors and microcontrollers in the early 1970s, and then the first single-chip digital signal processor (DSP) chips in the late 1970s. DSP chips have since been widely used in digital image processing. The discrete cosine transform (DCT) image compression algorithm has been widely implemented in DSP chips, with many companies developing DSP chips based on DCT technology. DCTs are widely used for encoding, decoding, video coding, audio coding, multiplexing, control signals, signaling, analog-to-digital conversion, formatting luminance and color differences, and color formats such as YUV444 and YUV411. DCTs are also used for encoding operations such as motion estimation, motion compensation, inter-frame prediction, quantization, perceptual weighting, entropy encoding, variable encoding, and motion vectors, and decoding operations such as the inverse operation between different color formats (YIQ, YUV and RGB) for display purposes. DCTs are also commonly used for high-definition television (HDTV) encoder/decoder chips. == Tasks == Digital image processing allows the use of much more complex algorithms, and hence, can offer both more sophisticated performance at simple tasks, and the implementation of methods which would be impossible by analogue means. In particular, digital image processing is a concrete application of, and a practical technology based on: Classification Feature extraction Multi-scale signal analysis Pattern recognition Projection Some techniques that are used in digital image processing include: Anisotropic diffusion Hidden Markov models Image editing Image restoration Independent component analysis Linear filtering Neural networks Partial differential equations Pixelation Point feature matching Principal components analysis Self-organizing maps Wavelets == Digital image transformations == === Filtering === Digital filters are used to blur and sharpen digital images. Filtering can be performed by: convolution with specifically designed kernels (filter array) in the spatial domain masking specific frequency regions in the frequency (Fourier) domain The following examples show both methods: ==== Image padding in Fourier domain filtering ==== Images are typically padded before being transformed to the Fourier space, the highpass filtered images below illustrate the consequences of different padding techniques: Notice that the highpass filter shows extra edges when zero padded compared to the repeated edge padding. ==== Filtering code examples ==== MATLAB example for spatial domain highpass filtering. === Affine transformations === Affine transformations enable basic image transformations including scale, rotate, translate, mirror and shear as is shown in the following examples: To apply the affine

ISO/IEC 11801

International standard ISO/IEC 11801 Information technology — Generic cabling for customer premises specifies general-purpose telecommunication cabling systems (structured cabling) that are suitable for a wide range of applications (analog and ISDN telephony, various data communication standards, building control systems, factory automation). It is published by ISO/IEC JTC 1/SC 25/WG 3 of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). It covers both balanced copper cabling and optical fibre cabling. The standard was designed for use within commercial premises that may consist of either a single building or of multiple buildings on a campus. It was optimized for premises that span up to 3 km, up to 1 km2 office space, with between 50 and 50,000 persons, but can also be applied for installations outside this range. A major revision was released in November 2017, unifying requirements for commercial, home and industrial networks. == Classes and categories == The standard defines several link/channel classes and cabling categories of twisted-pair copper interconnects, which differ in the maximum frequency for which a certain channel performance is required: Class A: Up to 100 kHz using Category 1 cable and connectors Class B: Up to 1 MHz using Category 2 cable and connectors Class C: Up to 16 MHz using Category 3 cable and connectors Class D: Up to 100 MHz using Category 5e cable and connectors Class E: Up to 250 MHz using Category 6 cable and connectors Class EA: Up to 500 MHz using category 6A cable and connectors (Amendments 1 and 2 to ISO/IEC 11801, 2nd Ed.) Class F: Up to 600 MHz using Category 7 cable and connectors Class FA: Up to 1 GHz (1000 MHz) using Category 7A cable and connectors (Amendments 1 and 2 to ISO/IEC 11801, 2nd Ed.) Class BCT-B: Up to 1 GHz (1000 MHz) using with coaxial cabling for BCT applications. (ISO/IEC 11801-1, Edition 1.0 2017-11) Class I: Up to 2 GHz (2000 MHz) using Category 8.1 cable and connectors (ISO/IEC 11801-1, Edition 1.0 2017-11) Class II: Up to 2 GHz (2000 MHz) using Category 8.2 cable and connectors (ISO/IEC 11801-1, Edition 1.0 2017-11) The standard link impedance is 100 Ω. (The older 1995 version of the standard also permitted 120 Ω and 150 Ω in Classes A−C, but this was removed from the 2002 edition.) The standard defines several classes of optical fiber interconnect: OM1: Multimode, 62.5 μm core; minimum modal bandwidth of 200 MHz·km at 850 nm OM2: Multimode, 50 μm core; minimum modal bandwidth of 500 MHz·km at 850 nm OM3: Multimode, 50 μm core; minimum modal bandwidth of 2000 MHz·km at 850 nm OM4: Multimode, 50 μm core; minimum modal bandwidth of 4700 MHz·km at 850 nm OM5: Multimode, 50 μm core; minimum modal bandwidth of 4700 MHz·km at 850 nm and 2470 MHz·km at 953 nm OS1: Single-mode, maximum attenuation 1 dB/km at 1310 and 1550 nm OS1a: Single-mode, maximum attenuation 1 dB/km at 1310, 1383, and 1550 nm OS2: Single-mode, maximum attenuation 0.4 dB/km at 1310, 1383, and 1550 nm Grandfathered === OM5 === OM5 fiber is designed for wideband applications using SWDM multiplexing of 4–16 carriers (40G=4λ×10G, 100G=4λ×25G, 400G=4×4λ×25G) in the 850–953 nm range. === Category 7 === Class F channel and Category 7 cable are backward compatible with Class D/Category 5e and Class E/Category 6. Class F features even stricter specifications for crosstalk and system noise than Class E. To achieve this, shielding was added for individual wire pairs and the cable as a whole. Unshielded cables rely on the quality of the twists to protect from EMI. This involves a tight twist and carefully controlled design. Cables with individual shielding per pair such as Category 7 rely mostly on the shield and therefore have pairs with longer twists. The Category 7 cable standard was ratified in 2002, and primarily introduced to support 10 gigabit Ethernet over 100 m of copper cabling. Like the earlier standards, it contains four twisted copper wire pairs rated for transmission frequencies of up to 600 MHz. However, in 2006, Category 6A was ratified for Ethernet to allow 10 Gbit/s while still using the conventional 8P8C connector. Care is required to avoid signal degradation by mixing cable and connectors not designed for that use, however similar. Most manufacturers of active equipment and network cards have chosen to support the 8P8C for their 10 gigabit Ethernet products on copper and not GG45, ARJ45, or TERA connectors as Class F would have originally called for. Therefore, the Category 6 specification was revised to Category 6A to permit this use; products therefore require a Class EA channel (ie, Cat 6A). As of 2019, some equipment has been introduced which has connectors supporting the Class F (Category 7) channel. Note, however, that Category 7 is not recognized by the TIA/EIA. === Category 7A === Class FA (Class F Augmented) channels and Category 7A cables, introduced by ISO 11801 Edition 2 Amendment 2 (2010), are defined at frequencies up to 1000 MHz. The intent of the Class FA was to possibly support the future 40 gigabit Ethernet: 40GBASE-T. Simulation results have shown that 40 gigabit Ethernet may be possible at 50 meters and 100 gigabit Ethernet at 15 meters. In 2007, researchers at Pennsylvania State University predicted that either 32 nm or 22 nm circuits would allow for 100 gigabit Ethernet at 100 meters. However, in 2016, the IEEE 802.3bq working group ratified the amendment 3 which defines 25GBASE-T and 40GBASE-T on Category 8 cabling specified to 2000 MHz. The Class FA therefore does not support 40G Ethernet. As of 2025, there is no equipment that has connectors supporting the Class FA (Category 7A) channel. Category 7A is not recognized in TIA/EIA. === Category 8 === Category 8 was ratified by the TR43 working group under ANSI/TIA 568-C.2-1. It is defined up to 2000 MHz and only for distances up to 30 m or 36 m, depending on the patch cords used. ISO/IEC JTC 1/SC 25/WG 3 developed the equivalent standard ISO/IEC 11801-1:2017/COR 1:2018, with two options: Class I channel (Category 8.1 cable): minimum cable design U/FTP or F/UTP, fully backward compatible and interoperable with Class EA (Category 6A) using 8P8C connectors; Class II channel (Category 8.2 cable): F/FTP or S/FTP minimum, interoperable with Class FA (Category 7A) using TERA or GG45. == Abbreviations for twisted pairs == Annex E, Acronyms for balanced cables, provides a system to specify the exact construction for both unshielded and shielded balanced twisted pair cables. It uses three letters—U for unshielded, S for braided shielding, and F for foil shielding—to form a two-part abbreviation in the form of xx/xTP, where the first part specifies the type of overall cable shielding, and the second part specifies shielding for individual cable elements. Common cable types include U/UTP (unshielded cable); U/FTP (individual pair shielding without the overall screen); F/UTP, S/UTP, or SF/UTP (overall screen without individual shielding); and F/FTP, S/FTP, or SF/FTP (overall screen with individual foil shielding). == 2017 edition == In November 2017, a new edition was released by ISO/IEC JTC 1/SC 25 "Interconnection of information technology equipment" subcommittee. It is a major revision of the standard which has unified several prior standards for commercial, home, and industrial networks, as well as data centers, and defines requirements for generic cabling and distributed building networks. The new series of standards replaces the former 11801 standard and includes six parts: == Versions == ISO/IEC 11801:1995 (Ed. 1) ISO/IEC 11801:2000 (Ed. 1.1) – Edition 1, Amendment 1 ISO/IEC 11801:2002 (Ed. 2) ISO/IEC 11801:2008 (Ed. 2.1) – Edition 2, Amendment 1 ISO/IEC 11801:2010 (Ed. 2.2) – Edition 2, Amendment 2 ISO/IEC 11801-1:2017, -1:2017/Cor 1:2018, -2:2017, -3:2017, -3:2017/Amd 1:2021, -3:2017/Cor 1:2018, -4:2017, -4:2017/Cor 1:2018, -5:2017, -5:2017/Cor 1:2018, -6:2017, -6:2017/Cor 1:2018 (As of September 2023, this set is current.)