Feed aggregator
ElasticSearch Server
Continuing from my previous post, the second book I worked on for PacktPub as a technical reviewer was "ElasticSearch Server".
It is a very good book about this product, which is built on the Lucene library, with many examples and concepts to help you exploit the full potential of free-text search with ElasticSearch.
As with the book described in the previous post (Apache Solr 4 Cookbook), this book was a perfect resource during the development of "Scotas's Push Connector", because it helped me integrate with these products and exploit their full potential.
Overview
- Learn the basics of ElasticSearch like data indexing, analysis, and dynamic mapping
- Query and filter ElasticSearch for more accurate and precise search results
- Learn how to monitor and manage ElasticSearch clusters and troubleshoot any problems that arise
Table of Contents
- Chapter 1: Getting Started with ElasticSearch Cluster
- Chapter 2: Searching Your Data
- Chapter 3: Extending Your Structure and Search
- Chapter 4: Make Your Search Better
- Chapter 5: Combining Indexing, Analysis, and Search
- Chapter 6: Beyond Searching
- Chapter 7: Administrating Your Cluster
- Chapter 8: Dealing with Problems
Authors
Rafał Kuć
Rafał Kuć is a born team leader and software developer. He currently works as a Consultant and a Software Engineer at Sematext Inc, where he concentrates on open source technologies such as Apache Lucene and Solr, ElasticSearch, and Hadoop stack. He has more than 10 years of experience in various software branches, from banking software to e-commerce products. He is mainly focused on Java, but open to every tool and programming language that will make the achievement of his goal easier and faster. Rafał is also one of the founders of the solr.pl site, where he tries to share his knowledge and help people with their problems with Solr and Lucene. He is also a speaker for various conferences around the world such as Lucene Eurocon, Berlin Buzzwords, and ApacheCon. Rafał began his journey with Lucene in 2002 and it wasn't love at first sight. When he came back to Lucene later in 2003, he revised his thoughts about the framework and saw the potential in search technologies. Then Solr came and that was it. From then on, Rafał has concentrated on search technologies and data analysis. Right now Lucene, Solr, and ElasticSearch are his main points of interest. Rafał is also the author of Apache Solr 3.1 Cookbook and the update to it, Apache Solr 4 Cookbook, published by Packt Publishing.
Marek Rogoziński
Marek Rogoziński is a software architect and consultant with more than 10 years of experience. His specialization concerns solutions based on open source projects such as Solr and ElasticSearch. He is also the co-founder of the solr.pl site, publishing information and tutorials about the Solr and Lucene library. He currently holds the position of Chief Technology Officer in Smartupz, the vendor of the Discourse™ social collaboration software.
Conclusion
Many developers know Lucene but do not know ElasticSearch. For those who want to get to know it, this book is a good way into the product and its potential.
Apache Solr 4 Cookbook
During the summer I had the chance to work for Packtpub as a technical reviewer of two great books.
The first, "Apache Solr 4 Cookbook", is a very good book for entering the world of free-text search integrated into any development or portal, with real, practical examples that use the latest version of Apache Solr.
Overview
- Learn how to make Apache Solr search faster, more complete, and comprehensively scalable
- Solve performance, setup, configuration, analysis, and query problems in no time
- Get to grips with, and master, the new exciting features of Apache Solr 4
Table of Contents
- Chapter 1: Apache Solr Configuration
- Chapter 2: Indexing Your Data
- Chapter 3: Analyzing Your Text Data
- Chapter 4: Querying Solr
- Chapter 5: Using the Faceting Mechanism
- Chapter 6: Improving Solr Performance
- Chapter 7: In the Cloud
- Chapter 8: Using Additional Solr Functionalities
- Chapter 9: Dealing with Problems
- Appendix: Real-life Situations
Author
Rafał Kuć is a born team leader and software developer. He currently works as a Consultant and a Software Engineer at Sematext Inc, where he concentrates on open source technologies such as Apache Lucene and Solr, ElasticSearch, and Hadoop stack. He has more than 10 years of experience in various software branches, from banking software to e-commerce products. He is mainly focused on Java, but open to every tool and programming language that will make the achievement of his goal easier and faster. Rafał is also one of the founders of the solr.pl site, where he tries to share his knowledge and help people with their problems with Solr and Lucene. He is also a speaker for various conferences around the world such as Lucene Eurocon, Berlin Buzzwords, and ApacheCon. Rafał began his journey with Lucene in 2002 and it wasn't love at first sight. When he came back to Lucene later in 2003, he revised his thoughts about the framework and saw the potential in search technologies. Then Solr came and that was it. From then on, Rafał has concentrated on search technologies and data analysis. Right now Lucene, Solr, and ElasticSearch are his main points of interest. Rafał is also the author of Apache Solr 3.1 Cookbook and the update to it, Apache Solr 4 Cookbook, published by Packt Publishing.
Conclusion
If you are about to start a development project or enter the world of free-text search, this book is a very good investment of time and resources. The practical examples really help you pick up new concepts in a simple, practical way with minimal effort.
SCNs and Timestamps
select distinct ora_rowscn from PLAN_TABLE;
But unless you're a database, that SCN doesn't mean much. You can put things in some sort of order, but not much more.
Much better is
select sys.scn_to_timestamp(ora_rowscn) from PLAN_TABLE;
unless it gives you
ORA-08181: specified number is not a valid system change number
which is database-speak for "I can't remember exactly".
That's when you might be able to fall back on this, plugging the SCN in place of the **** :
select *
from (select first_time,
             first_change# curr_change,
             lag(first_change#) over (order by first_change#) prev_change,
             lead(first_change#) over (order by first_change#) next_change
      from v$log_history)
where **** between curr_change and next_change
It won't be exact, and it doesn't stretch back forever. But it is better than nothing.
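If it helps to see the idea outside the database, here is a minimal Python sketch of the same interval lookup. It is not Oracle code: it assumes you have already fetched (first_change#, first_time) pairs from v$log_history into a list, and the function name and sample rows are made up for illustration. Unlike the query above it treats each interval as half-open, so an SCN that sits exactly on a log boundary maps to one interval only.

```python
from bisect import bisect_right
from datetime import datetime


def approximate_scn_time(log_history, scn):
    """Return (earliest, latest) possible times for scn, or None.

    log_history: list of (first_change, first_time) tuples, standing in
    for rows fetched from v$log_history (hypothetical data here).
    None means the SCN is outside the history that is still retained.
    """
    rows = sorted(log_history)
    changes = [change for change, _ in rows]
    # Index of the last log whose first_change is <= scn.
    idx = bisect_right(changes, scn) - 1
    if idx < 0 or idx + 1 >= len(rows):
        return None
    # The change happened between this log's start and the next log's start.
    return rows[idx][1], rows[idx + 1][1]


# Hypothetical rows, not real database output.
history = [
    (1000, datetime(2013, 5, 1, 8, 0)),
    (2000, datetime(2013, 5, 1, 12, 0)),
    (3000, datetime(2013, 5, 1, 18, 0)),
]
print(approximate_scn_time(history, 2500))
```

Returning both ends of the interval makes the "it won't be exact" part explicit: all you really know is that the change landed somewhere between two log switches.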
PS. This isn't a perfect way to find when a row was really inserted/updated. It is probably at the block level, and there's 'stuff' that can happen which doesn't actually change the row but might still reset the SCN. If you're looking for perfection, you're at the wrong blog :)
Big data can improve customer experience, reduce churn
As companies have come to realize that attracting and retaining customers depend on providing a personalized experience, more firms are looking to mine big data for enhanced analytic insight. By leveraging support from DBA services, enterprises have the opportunity to understand consumers on a deeper level, develop more targeted strategies and thus secure stronger and more loyal relationships.
CIO Magazine reported that Nationwide, a 90-year-old insurance company with a multitude of databases and compliance requirements, has spent billions on big data initiatives for these purposes. Matt Jauchius, CMO of Nationwide, explained that these efforts are invaluable to his company.
"The ability to collect vast amounts of data on individual consumers – their consumption habits, their preferences, their interactions with the company – and then analyze those data sets for predictive behavior and proactively apply those insights … [that's] the basis of competitive advantage in the future for the CMO because you can provide a better experience," he told CIO.
Elana Anderson, vice president of IBM Enterprise Marketing Management, noted that big data projects have the potential to give marketers predictive capabilities that could result in more successful campaigns.
"Marketing has long been on a quest to get to the individual," she said, according to CIO. "Smart marketers…have been trying to get beyond the demographic for a long, long time. If you're able to address the individual at an individual level, if you're able to sense needs or meet needs before the customer is explicitly saying, 'I have a need,' that requires big data and analytics in order to get to that point. We're seeing tremendous value with use cases around that."
More profitable relationships
Forbes revealed that successful big data initiatives can even lessen customer churn rates. The source explained that understanding this rate is critical for identifying and addressing any ineffective tactics or weak relationships. By patching these faults, businesses can retain more clients and ultimately drive profit long-term. According to Forbes, recent research from a data visualization firm found that if a customer has made just one purchase from a specific store, there is a 27 percent chance that he or she will repeat business with that company. Further, after three purchases, the customer is twice as likely to return in the future.
By analyzing big data from an ever-increasing variety of channels, including mobile devices and social media, enterprises can gain a stronger grasp on customers' needs, desires and preferences. This can eventually strengthen client loyalty and as a result, boost revenue.
RDX's business intelligence and big data experts assist customers in leveraging data contained in large data stores. For more information, please visit our Business Intelligence and Predictive Analytics pages or contact us.
New Row Delete for ADF Form (ADF Webinar Follow-Up)
You can download the sample application here: NewRowRemoveApp_v2.zip. The main trick is, in addition to setting Immediate=true on the Delete button, to add the ADF resetActionListener operation to the same button:

Make sure Immediate=true is set on the Delete button, along with the ADF resetActionListener:

Add a new row and force validation errors by trying to navigate to the next row:

Press the Delete button to remove the new row; the validation errors will be ignored and the row will be removed:

For the ADF Form component, setting the Immediate=true property alone is not enough; the ADF resetActionListener must be added to force a form refresh.
Performance Monitoring
Updated – just a quick reminder for next week; I’ll be doing a short webinar next Wednesday comparing the performance monitoring tools Oracle and SQL Server provide.
I think I may have broken my record with 6 countries in 6 weeks – so I haven’t been very thorough at updating my blog recently. Just time, before I head off to Heathrow once again, to do a quick advert for the next redgate webinar that I’m doing with Grant Fritchey. This time comparing built-in performance monitoring tools. Details and Registrations at this URL.
I’ll see if I can catch up with a couple of answers while I’m in the airport lounge – but no promises, since the simple act of walking into an airport makes me feel like falling asleep.
Webinars
The webinars on “Smarter Statistics in 11g” are on tomorrow (Friday) at 2:00 pm and 6:00 pm. There’s a waiting list for the 6:00 pm event, so if you’ve signed up but can’t make it please delete your registration. (The event will be repeated on 10th June). If you want to vote for a better time for me to do short webinars there’s a poll at the end of the article.
I’m about to make a serious move into online webinars, and as a warm-up exercise I’ll be doing a couple of one-hour free events on Friday 17th May.
I’ll be talking through a PowerPoint presentation called “Smarter Statistics in 11g” twice, once at 2:00 pm – 3:00 pm BST, and again at 6:00 pm BST (12:00 pm - 1:00 pm CDT). Broadly speaking the first one is for the benefit of people from the UK and eastwards, and the second is for the benefit of people from the US and westwards. This is just a trial run, of course, and if it works well I will be doing more of the same, perhaps three times per day to spread across more time zones.
John Goodhue (my O1 sponsor for the USA) is arranging all the mechanical details, and I’ll post links for registration when they become available – we’ll be using GoToWebinar as the supply mechanism, and we’ll be limiting access to 100 people (so if you do register and can’t attend, please remove yourself from the list; if you don’t manage to register for either event, you’ll get another chance later as I plan to repeat each event a few times.)
I’ll also be doing a full day paid event on 23rd May which will be my “Indexing Strategies” tutorial. This first full day event will be timed to suit the American audience – although anyone can register, of course – but we plan to have further events suited to other time zones. The URL for registration is now available – with an option to purchase a 30-day window to the recording of my “Oracle Mechanisms” presentation in Minneapolis.
BGOUG Spring 2013 : Day -1
It’s stupid o’clock in the morning and I’m waiting for my taxi to arrive. Considering how close Bulgaria is, it takes me a very long time to get there.
I am a mix of excited and nervous. This is my first conference this year, so all the usual insecurities are in full effect, from fear of flying to the constant nagging thoughts that perhaps I don’t know anything about Oracle and maybe I shouldn’t be on stage acting like I do.
I’m sure it will go OK and it will be nice to meet up with the gang again.
Cheers
Tim…
BGOUG Spring 2013 : Day -1 was first posted on May 16, 2013 at 4:55 am. ©2012 "The ORACLE-BASE Blog".
Learning from our customers: Mortenson Construction
Today, Thursday May 16th at 1pm Eastern, 10am Pacific, we will be broadcasting a new webcast featuring a WebCenter customer, Mortenson Construction. I hope you can take the time to join us, either live or later on by viewing it on-demand. Many of us know the benefits of enterprise content management (ECM), but they have taken it to another level by making sure that every person involved in a construction project has immediate access to the info they need.
OK, so some of you may be saying, isn't that what ECM is all about? Immediate access to the right content in order to make the right business decisions. Well, yes, ideally but I suspect we all know scenarios where that is not necessarily the case. For businesses that do not take the time to incorporate a centralized approach to content management and dissemination, ECM can become a great place to hide information, not use it effectively.
Mortenson Construction has done a great job of making sure that project owners, managers, designers, architects, trade partners and finance teams all have secure access into the project information they need. And best of all, they can do it from anywhere, on mobile devices, even on the job site itself. Most of us are not in the construction industry but we have seen projects underway.
You've probably seen the trailers that are placed on the job site so that managers, foremen and various craft workers can meet and discuss the latest design specifications and resolve problems as they arise. But what about the work team on the 16th floor of the high-rise being built? Do they have to take an extended break every time there is an issue to resolve and make their way back to the trailer to discuss it? Not if they work at Mortenson! They have a portable "Field Box" that is effectively a small office in a steel container. It can be moved anywhere by crane and be immediately online with access to every bit of project information.
Mortenson calls this "Project Connect" and it is a great example of how a company can take the power of content that must be securely managed within an ECM system to meet information governance and compliance requirements and get it to everyone that needs it... anywhere!
We hope you will join the webcast tomorrow and hear directly from the team at Mortenson Construction about the benefits that they are realizing by using WebCenter Content as their ECM system. Maybe you can realize some of those benefits too!
Click this link to join us and register to watch this informative webcast.
How We Interact with Our Environments and Our Devices Has a Fascinating Effect on Enterprise Applications

This pithy interview gives you a glimpse into our visionary UX team. The interviewee is Jeremy Ashley (left), vice president of Applications User Experience. His answers are crisp and informative. The interviewer, ACE Director Debra Lilley (right), is knowledgeable and keeps the conversation snappy.
If that isn’t enough to make you want to click to the interview, here’s the elevator pitch. Ashley level sets where we are with FUSE, which is the new look and feel of Oracle applications. His team is looking closely at changes in technology, as well as how society is changing in regard to its reaction to technology. In recent years, cell phones, smaller display screens, and the iPad required his team to design even more efficient user experiences.
Now we are in another major transition phase. Products come out very quickly. Technology has become more personal. The challenge is to discover how users can integrate with their environment. He says, “It’s not really about mobile meaning a mobile device. It’s more about us being on the go and how the environment can assist and react accordingly.”
Ashley’s team is pushing the envelope, collaborating, and bringing colleagues and customers along with them. They provide common guidelines for standards. They also create usable, consistent components and flows (meta-components called design patterns) so internal developers and customers can build good usable experiences for their products.
The interview is entitled User Experience and How it Makes Applications Easier to Use. It appeared in Oracle Scene published by the Oracle User Group in the United Kingdom.
Virtual Developer Day - Java - June 19th

“Take Java to the Edge”
You know Java, now really know Java. Learn about the latest technical improvements in Java from the source. Watch informative tutorials (that you can repeat at your own pace) to improve your Java programming expertise and engage in live chat sessions with the preeminent Java experts.
Register NOW!
Join this FREE virtual event where you will learn about:
- Improved developer productivity and HTML5 applications
- Language improvements in Java SE to accelerate application development
- Features in Java that help you begin programming on a wide range of embedded devices
Don't miss this opportunity. Register NOW!
AMERICAS/CANADA – June 19th, 2013
09:00 a.m. - 01:00 p.m. PDT
12:00 p.m. - 04:00 p.m. EDT
01:00 p.m. - 05:00 p.m. BRT
Big data success depends on IT alignment, expertise
Big data has become an increasingly critical focus for enterprises in every sector, but decision-makers are still struggling to make sense of these vast volumes of information. In order to extract value from these sources, enterprises need to seek reputable database experts to aid in capturing and analyzing both structured and unstructured data.
According to CMSWire, a major reason that big data projects fail is that departments often aren't on the same page in terms of defining the scope, objectives and technological needs of these initiatives. The source asserted that, in order to mine this information effectively, business and IT groups need to be aligned on the goal of these efforts. By committing to a specific problem to solve or question to answer, firms are much better prepared to derive the right analytics. Another issue, the news source revealed, is that access to data is often too restricted, preventing certain team members from finding useful answers. This is mainly due to silos that have formed around sales, marketing, HR, financial and other data, which is strictly guarded in ways that inhibit the gathering of real insight from big data. While some of these silos are necessary for compliance, CMSWire explained, buy-in needs to come from top executives so that adequate information can be made available to those who need it.
Expertise and support
And even with emerging technologies for big data analysis, CMSWire pointed out that many of these solutions are so foreign that enterprises lack the ability to work with them to drive results. It is critical to have the right skills on board when executing these initiatives because these projects go beyond traditional analysis. CMSWire asserted that big data requires an understanding of machine learning and natural language processing, knowledge that many IT teams lack. Fortunately, firms can partner with database experts to fill this shortfall.
Lifehacker reported that Gartner Analyst Brian Burke recently emphasized the importance of acknowledging the big data skills shortage at the Gartner Enterprise Architecture Summit.
"Some of you are already wrestling with the issues of big data inside your organization," he said. "Some of you know that it's coming but aren't quite sure how to prepare for it. This challenge is about skills that aren't just about IT. They're clearly also about business… You must help your organization gain clarity so there are two sides of the house working together in unison."
By deploying third-party services for database management, monitoring and analysis, firms can start to fill that gap and drive big data success.
Subscribing to Oak Table blogs feed
I’ve seen some very good information posted in this feed, which combines blog postings from many different Oracle performance experts who are part of what is called the “Oak Table”:
http://www.oaktable.net/feed/blog-rss.xml
I’ve been using Internet Explorer to keep track of new posts in its “Feeds” section of the Favorites. Here is how to add the Oak Table blog feed to Internet Explorer:
Go to the URL listed above and click on “Subscribe to this feed”
Click on the Subscribe button
Success! Now click on Favorites and then Feeds
For any feed in your list if you see the feed name in a darker font it means there is a new post. So, as I have time, I’ll go to my feeds and see which of the ones I’ve subscribed to have new posts. If you are looking for performance tuning information I highly recommend the Oak Table feed.
- Bobby
Our Glass Overlords Have Arrived
We ran into Floyd (@fteter) last night. His cyborg transformation is complete.
Note the serious demeanor; with Glass power comes great responsibility, or something.
Backstory: Anthony (@anthonyslai) finally got his Explorer Series Glass unit on Sunday. Funny story, its display had a few dead pixels, three actually. He counted. Google replaced the unit, so all’s well.
Anyway, Anthony has generously been allowing people to test-drive his Glasses, and boy, do they get attention. People wherever we go are curious. Noel (@noelportugal) and I each took a turn, and despite the bare feature set, they’re pretty amazing.
Anthony says he’s been wearing them non-stop since Sunday, and that he can’t live without them. Pretty strong endorsement. I know he’s been using them heavily because texts from him have the latest in gadgety signatures appended “Sent through Glass.”
Look for a post from him on his adventures soon. He and Noel are attending Google I/O this week, and I’m sure there will be lots of Glass news.
Getting Interactive Report Query
http://docs.oracle.com/cd/E37097_01/doc/doc.42/e35127/apex_ir.htm#BABEFDJE
but was confused by the statement for getting the IR Query:
DECLARE
   l_report   apex_ir.t_report;
   l_query    varchar2(32767);
BEGIN
   l_report := APEX_IR.GET_REPORT (
                  p_page_id   => 1,
                  p_region_id => 2505704029884282,
                  p_report_id => 880629800374638220);
   l_query := l_report.sql_query;
   for i in 1..l_report.binds.count
   loop
      dbms_output.put_line(i||'. '||
         l_report.binds(i).name||
         '='||l_report.binds(i).value);
   end loop;
END;
If you run this statement, you will receive a concatenated string of the binds used for filtering and their corresponding values, but not the actual query (it is simply not printed out). In addition, you need to combine this statement with the one for getting the last viewed report id:
DECLARE
   l_report_id   number;
BEGIN
   l_report_id := APEX_IR.GET_LAST_VIEWED_REPORT_ID (
                     p_page_id   => 1,
                     p_region_id => 2505704029884282);
END;
After talking to Patrick Wolf I realized that this statement delivers almost everything you need in order to get the complete query. I combined the two statements and created a function which you can use to get a query for any of your interactive reports including replaced binds. The function code is:
CREATE OR REPLACE FUNCTION get_report_sql (
   p_app_id     IN   NUMBER,
   p_page_id    IN   NUMBER,
   p_all_cols   IN   BOOLEAN DEFAULT TRUE
)
   RETURN VARCHAR2
IS
   v_report_id   NUMBER;
   v_region_id   NUMBER;
   v_report      apex_ir.t_report;
   v_query       VARCHAR2 (32767);
   v_column      VARCHAR2 (4000);
   v_position    NUMBER;
BEGIN
   -- Find the interactive report region on the page
   -- (assumes the page has exactly one such region).
   SELECT region_id
     INTO v_region_id
     FROM apex_application_page_regions
    WHERE application_id = p_app_id
      AND page_id = p_page_id
      AND source_type = 'Interactive Report';

   -- Look up the last viewed report, then fetch its query and binds.
   v_report_id :=
      apex_ir.get_last_viewed_report_id (p_page_id     => p_page_id,
                                         p_region_id   => v_region_id
                                        );
   v_report :=
      apex_ir.get_report (p_page_id     => p_page_id,
                          p_region_id   => v_region_id,
                          p_report_id   => v_report_id
                         );
   v_query := v_report.sql_query;

   -- Replace each bind placeholder with its value as a quoted literal.
   FOR i IN 1 .. v_report.binds.COUNT
   LOOP
      v_query :=
         REPLACE (v_query,
                  ':' || v_report.binds (i).NAME,
                  '''' || v_report.binds (i).VALUE || ''''
                 );
   END LOOP;

   -- Optionally rebuild the SELECT list from all IR columns.
   IF p_all_cols
   THEN
      FOR c IN (SELECT   *
                    FROM apex_application_page_ir_col
                   WHERE application_id = p_app_id AND page_id = p_page_id
                ORDER BY display_order)
      LOOP
         v_column := v_column || ', ' || c.column_alias;
      END LOOP;

      v_column := LTRIM (v_column, ', ');
      -- Keep everything from the first opening parenthesis onward.
      v_position := INSTR (v_query, '(');
      v_query := SUBSTR (v_query, v_position);
      v_query := 'SELECT ' || v_column || ' FROM ' || v_query;
   END IF;

   RETURN v_query;
EXCEPTION
   WHEN OTHERS
   THEN
      -- On any error, return the error message instead of a query.
      v_query := SQLERRM;
      RETURN v_query;
END get_report_sql;
You can call this function in your application or in a PL/SQL package run from an application session like this:
DECLARE
   v_sql   VARCHAR2 (4000);
BEGIN
   v_sql := get_report_sql (:app_id, :app_page_id, FALSE);
   HTP.prn (v_sql);
END;
Setting the parameter p_all_cols to TRUE would export all columns used in the IR SQL.
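One thing to watch in the bind replacement step of the function: a plain textual REPLACE of ':' || name can also hit a longer bind name that starts with the same characters, and values containing single quotes are not escaped. The sketch below, written in Python rather than PL/SQL, shows one possible way to handle both. It is only an illustration: the function name, the bind names B1 and B10 and the sample values are invented, it is not part of the APEX API, and it assumes bind names consist of letters, digits and underscores only.

```python
import re


def inline_binds(sql, binds):
    """Replace :NAME placeholders in sql with quoted literal values.

    binds: dict mapping bind name -> value (hypothetical sample data).
    Whole bind names are matched, so :B1 does not clobber :B10, and
    single quotes inside values are doubled.
    """
    def quote(value):
        return "'" + str(value).replace("'", "''") + "'"

    def substitute(match):
        name = match.group(1)
        # Leave placeholders we have no value for untouched.
        return quote(binds[name]) if name in binds else match.group(0)

    # A placeholder is a colon followed by word characters.
    return re.sub(r":(\w+)", substitute, sql)


# Hypothetical query and binds, for illustration only.
query = "select * from emp where ename = :B1 and job = :B10"
print(inline_binds(query, {"B1": "O'Neil", "B10": "CLERK"}))
```

With a naive replace of ":B1" first, the ":B10" placeholder would end up as the B1 literal followed by a stray 0. Note that this sketch does not parse SQL, so a colon inside a string literal that happens to be followed by a known bind name would still be replaced.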
Enjoy.
Big Data - are you the house or the played?
Getting students useful feedback from machine learning
Last month, I wrote this narrow defense of automated essay grading, hoping to clear the air on a new and controversial technology. In that post’s prolific comments section, Laura Gibbs made a comment echoing what I’ve heard from every teacher I speak to.
I am waiting for someone to show me a real example of this “useful supplement” provided by the computer that is responding to natural human language use – I understand what you want it to be, but I would contend that natural human language use is so complex (complex for a computer to apprehend) that trying to give writing mechanics feedback on spontaneously generated student writing will lead only to confusion for the students.
When we talk about machine learning being used to automatically grade writing, most people don’t know what that looks like. Because they don’t know the technology, they make it up. As far as I can tell, this is based on a combination of decades-old technology like Microsoft Word’s green grammar squiggles, clever new applications like Apple’s Siri personal assistant, and downright fiction, like Tony Stark’s snarky talking suits. What you get from this cross is a weird and incompetent artificial intelligence pointing out commas and giving students high grades for hiding the word “defenestration” in an essay.
My cofounder at LightSIDE Labs, David Adamson, taught in a high school for six years. If we were endeavoring to build something that was this unhelpful for teachers, he would have walked out a long time ago. In fact, though, David is a researcher in his own right. David’s Ph.D. research isn’t as focused on machine learning and algorithms as my own; instead, his work brings him into Pittsburgh public schools, talking with students and teachers, and putting technology where it can make a difference. In this post, rather than focus on essay evaluation and helping students with writing – which will be the subject of future posts – I’m going to explore the things he’s already doing in classrooms.
Building computers that talk to students
David builds conversational agents. These agents are computer programs that sit in chatrooms for small-group discussion in class projects, looking by all appearances like a moderator or TA logged in elsewhere. They’re not human, however – they’re totally automated. They have a small library of lines that they can inject into the discussion, which can be automatically modified slightly in context. They use language technology, including machine learning as well as simpler techniques, to process what students are saying as they work together. The agent has to decide what to say and when.
Those pre-scripted lines aren’t thrown in arbitrarily. In fact, they’re descended from decades of research into education and getting classroom discussion right. This line of research is called Accountable Talk, and in fact there’s an entire course coming up on Coursera about how to use this theory productively. The whole thing is built on fairly basic principles:
First, students should be accountable to each other in a conversation. If you’re only sharing your own ideas and not building off of the ideas of others, then it’s just a bunch of people thinking alone, who happen to be in a chatroom together. You don’t get anything out of the discussion. Next, your thought process should be built off of connecting the dots, making logical conclusions, and reasoning about the connections between facts. Finally, those facts that you’re basing your decision-making on should be explicit. They should come from explicit sources and you should be able to point to them in your argument for why your beliefs are correct.
David’s agents are framed around Accountable Talk, doing what teachers know leads to a good discussion. Instead of giving students instructions or trying to evaluate whether they were right or wrong, they merely ask good questions at the right times. Agents were trained to look for places where students made a productive, substantial claim – the type of jumping-off point that Accountable Talk encourages. He never tried to correct those claims, though; he didn’t even evaluate whether they were right or wrong. He was just looking for the chance to make a difference in the discussion.
He used those automated predictions as a springboard for collaborative discussion. Agents were programmed to try to match student statements to existing facts about a specific chemistry topic. “So, let me get this right. You’re saying…” More often than not, he also programmed the agents to lean on other students for help. “[Student 2], can you repeat what [Student 1] just said, in your own words? Do you agree or disagree? Why?” Automated prompts like this leave the deep thinking to students. Instead of following computer instructions by rote, the students were being pushed into deeper discussions. Agents give the authority to students, asking them to lead and not taking on the role of a teacher and looming over them.
Sometimes computers fail
In the real world, intervention to help students requires confidence that you’re giving good advice. If David’s agents always spout unhelpful nonsense, students will learn to ignore them. Perhaps worst of all, if the agent tries to reward students for information it thinks is correct, a wrong judgment means students get literally the opposite of helpful teaching. With all of this opportunity for downside, reliability seems like it would be the top priority. How can you build a system that’s useful for intervening in small groups if it makes big mistakes?
This is mostly accounted for by crafting the right feedback: designing agents that play to the technology’s strengths and avoid its weaknesses. In large part this comes down to avoiding advice that’s so clear-cut that big mistakes are possible. Grammar checking and evaluations of accuracy within a sentence are doomed to fail almost from the start. If your goal with a machine learning system is to correct every mistake that every student makes, you’re going to need to be very confident, and because this is a statistics game we’re playing, that kind of technology is going to disappoint. Moreover, even when you get it right, what has a student gained by being told to fix a run-on sentence? At best, an improvement at small-scale grammar understanding. This is not going to sweep anyone off their feet.
By basing his conversational agents on the tenets of a good discussion, David was able to gain a lot of ground with what is, frankly, pretty run-of-the-mill machine learning. Whiz-bang technology is secondary to technology that does something that helps. When the system works, it skips the grammar lessons. Instead, it jumps into the conversation at just the right time to encourage students to think for themselves.
Sometimes, though, the agent misfires. When using machine learning, this is something you just have to accept. What we care about is that this doesn’t hurt students or start teaching wrong ideas. So let’s think about the cases where an agent can make a wrong decision: first, where the agent could have given feedback but didn’t, and second, where the agent gives the wrong feedback at the wrong time.
First, the easy case. Sometimes a student will say something brilliant and the agent will fail to catch it. Here, the balance of authority between agent and student matters. If students get used to the idea that the agent is a teacher, they’ll be looking for it to tell them they got every answer right. This is a danger zone for us – with statistical machine learning, we don’t want to back ourselves into a corner where the system has to make judgments every single time. Sometimes, we’ll be uncertain. If an agent misses its cue and the students expect a teacher’s authoritative judgment of correctness, the group will flounder when the agent misses a beat, or take the discussion into their own hands and leave the agent in the dust.
Let’s see what that looks like in practice. Here’s a transcript from an earlier study, when conversational agents weren’t as heavily rooted in the theory of Accountable Talk:
Tutor: You should now move on to discussing Condition C.
Tutor: [Student A], now would be a good time to ask [Student B] to build on what [Student C] is saying.
[Student B]: I’m so confused!
[Student A]: [Student B], would you like to build onto what [Student C] is saying? And me too!
Tutor: When you agree, write down your predictions on your worksheet.
This is dysfunctional; the tutor isn’t asking discussion-generating questions, and it’s pushing forward regardless of context. Focusing on the task single-mindedly doesn’t give students room to think.
By taking on the role of facilitator, though, the agent’s comments aren’t expected every time. We can use the system to chime in with an Accountable Talk question when we’re highly confident that the machine learning system is making the right prediction, but if the system stays silent, the students won’t even notice it’s missing; the agent is purely there to help scaffold learning, rather than judge correctness. When you’re augmenting teaching, rather than replacing humans entirely, you can afford to be cautious. Look at how the agent interacts when we start asking questions that are better suited to what we know we can do:
[Student D]: ok, boiling pt will go up and vdW will go up for all of them consecutively… right?
Tutor: Do you concur with [Student D]? Why, or why not?
[Student E]: hmm not necessarily
[Student F]: area goes up for each
[Student E]: would it?
[Student E]: im not sure
[Student D]: yea for sure area goes up
[Student F]: dipole increases first one
In this excerpt the tutor didn’t give an instruction or evaluate anything the first student said. It simply asked a basic question because the machine learning system flagged that spot as a good opening. The comments from these new agents use Accountable Talk principles, and get student groups discussing ideas.
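To make the "speak only when confident" idea concrete, here is a minimal sketch in Python. It is not taken from David's system: the names `Turn`, `agent_response`, and `CONFIDENCE_THRESHOLD`, and the 0.8 cutoff, are illustrative assumptions, and the confidence score is assumed to come from some trained classifier that is not shown here.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cutoff: the agent speaks only when the classifier is at least
# this confident that a turn contains a substantive claim.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Turn:
    speaker: str  # e.g. "[Student D]"
    text: str     # what the student typed


def agent_response(turn: Turn, confidence: float) -> Optional[str]:
    """Return a facilitation prompt, or None to stay silent.

    `confidence` is assumed to be a score in [0, 1] from a classifier that
    estimates whether the turn makes a substantive claim.
    """
    if confidence < CONFIDENCE_THRESHOLD:
        # Uncertain: say nothing rather than risk a wrong intervention.
        return None
    # The prompt never judges whether the claim is right or wrong;
    # it hands the question back to the group.
    return f"Do you concur with {turn.speaker}? Why, or why not?"


turn = Turn("[Student D]", "ok, boiling pt will go up and vdW will go up")
print(agent_response(turn, confidence=0.93))  # agent asks the group
print(agent_response(turn, confidence=0.40))  # None: agent stays silent
```

The design choice worth noticing is that a missed cue costs nothing. When the function returns `None`, the students simply carry on, which is only acceptable because the agent is positioned as a facilitator rather than a judge.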
Of course, these systems aren’t perfect. What we’re finding out, though, is that we can frame the discussion right for automated assessment by not trying to make our automated system the perfect arbiter of truth. What I’m describing isn’t a dire portrait of machines taking over the education system. It’s agents contributing meaningfully to learning by cautiously intervening when appropriate, using machine learning for educated guessing about when it’s time to get students to think more deeply. These agents are tireless and can be placed into every discussion in every online small group at all times – something a single teacher in a large class will never be able to do.
The results with these agents were clear: students learned significantly more than students who didn’t get the support. Moreover, when students were singled out and targeted by agent questioning, they participated more and led a more engaged, more assertive conversation with the other students. The agent didn’t have to give students remedial grammar instructions to be valuable; the data showed that the students took their own initiative, with the agents merely pushing them in the right direction. Machine learning didn’t have to be perfect. Instead, machine learning figured out the right places to ask questions, and worked towards making students think for themselves. This is how machine learning can help students.
For helping students, automated feedback works.
We should be exercising caution with machine learning. Skeptics are right to second-guess interventions from technologists who aren’t working with students. The goal is often to replace teachers, not help them, especially with the promise of tantalizingly quick cost savings. Yes – if you want to make standardized testing cheaper, machine learning works. I don’t want to dismiss this entirely – we can, in fact, save schools and states a lot of money on existing standardized tests – but if that’s as far as your imagination takes you, you’re missing the point. What’s important isn’t that we can test students more, and more quickly, with less money. Focus on this: we can actually help students.
Not every student is going to get one-on-one time daily with a trained writing tutor. Many are never going to see a writing tutor individually in their entire education. For these students, machine learning is stepping in, with instant help. These systems aren’t going to make the right decision every time in every sentence. We need to know that, and we need to work with it. Rather than toss out technology promising the moon, look carefully at what it can do. Shift expectations as necessary. In David’s case, the shift was about authority. He empowered students to take up their own education and had the agent chime in only when it saw an opportunity, positioning the automated system as guide rather than dictator.
This goes way beyond grading, and way beyond grammar checking. Machine learning helps students when teachers aren’t there. Getting automated feedback right leads to students thinking, discussing ideas, and learning more – and that’s what matters. In my next post, I’d like to launch off from here and talk about what these lessons mean not just for discussion, but for writing. Stay tuned.
A last note
The work I described from David is part of an extended series of more than 20 papers and journal articles from my advisor at Carnegie Mellon, Carolyn Rosé, and her students. While I won’t give a bibliography for a decade of research, some of the newest work is published as:
- “Intensification of Group Knowledge Exchange with Academically Productive Talk Agents,” in this year’s CSCL conference.
- “Enhancing Scientific Reasoning and Explanation Skills with Conversational Agents,” submitted to IEEE Transactions on Learning Technologies.
- “Towards Academically Productive Talk Supported by Conversational Agents,” in the 2012 conference on Intelligent Tutoring Systems.
I’ve asked David to watch this post’s comments section, and I’m sure he’ll be happy to directly answer any questions you have.
The post Getting students useful feedback from machine learning appeared first on e-Literate.
Grant Ronald and Susan Duncan for ADF Mobile SIG (UKOUG, London, May 21st)
Grant Ronald and Susan Duncan, both well-known Oracle speakers, will host this event. I know there are only a few seats left, so you should hurry to register.
ECEMEA Drop In and Learn - Your invitation from Oracle Support
Here’s an easy way to make sure you’re making the most of the support available under your Oracle service contract: our regular Drop In And Learn events.
Drop In And Learn sessions are short, informal, and free of charge. They allow you to talk to Oracle Support staff face to face, find out more about the support available and get answers to the questions that matter to you. Topics discussed so far have included My Oracle Support, making Auto Service Requests and using Proactive Support more effectively.
The sessions will be happening every month in your local Oracle offices and last no more than a couple of hours.
| Country | Date | Time | Location | Attendance * |
|---|---|---|---|---|
| Austria | Tues 4th June | 16:00 | Oracle Office, Vienna | I plan to attend |
| Croatia | Wed 5th June | 14:00 | Oracle Office, Zagreb | I plan to attend |
| Czech Republic | Wed 5th June | 16:00 | Oracle Office, Prague | I plan to attend |
| Czech Republic | Wed 3rd July | 16:00 | Oracle Office, Prague | I plan to attend |
| Egypt | Wed 5th June | 11:00 | Oracle Office, Cairo | |
| Egypt | Wed 3rd July | 11:00 | Oracle Office, Cairo | |
| Estonia | June - TBD | | Oracle Office, Tallinn | |
| Estonia | July - TBD | | Oracle Office, Tallinn | |
| Greece | Wed 5th June | 14:00 | Oracle Office, Athens | |
| Greece | Wed 3rd July | 14:00 | Oracle Office, Athens | |
| Hungary | Wed 5th June | 09:00 | Oracle Office, Budapest | I plan to attend |
| Hungary | Wed 3rd July | 09:00 | Oracle Office, Budapest | I plan to attend |
| Kazakhstan | Wed 5th June | 16:00 | Oracle Office, Almaty | |
| Kazakhstan | Wed 3rd July | 16:00 | Oracle Office, Almaty | |
| Kenya | Wed 5th June | 09:00 | Oracle Office, Nairobi | I plan to attend |
| Kenya | Wed 3rd July | 16:00 | Oracle Office, Nairobi | I plan to attend |
| Latvia | Mon 20th May | 16:00 | Oracle Office, Riga | |
| Latvia | June - TBD | | Oracle Office, Riga | |
| Latvia | July - TBD | | Oracle Office, Riga | |
| Lithuania | June - TBD | | Oracle Office, Vilnius | |
| Lithuania | July - TBD | | Oracle Office, Vilnius | |
| Nigeria | Wed 5th June | 11:00 | Oracle Office, Lagos | I plan to attend |
| Nigeria | Wed 3rd July | 16:00 | Oracle Office, Lagos | I plan to attend |
| Poland | Wed 5th June | 09:00 | Oracle Office, Warsaw | I plan to attend |
| Poland | Wed 3rd July | 09:00 | Oracle Office, Warsaw | I plan to attend |
| Romania | Wed 5th June | 15:00 | Oracle Office, Bucharest | |
| Romania | Wed 3rd July | 15:00 | Oracle Office, Bucharest | |
| Russia | Wed 5th June | 15:00 | Oracle Office, Moscow | |
| Russia | Wed 10th July | 15:00 | Oracle Office, Moscow | |
| Saudi Arabia | Mon 3rd June | 15:00 | Oracle Office, Riyadh | |
| Saudi Arabia | Mon 1st July | 15:00 | Oracle Office, Riyadh | |
| Slovenia | Wed 5th June | 14:00 | Oracle Office, Ljubljana | I plan to attend |
| Slovakia | Wed 19th June | 16:00 | Oracle Office, Bratislava | |
| South Africa | Wed 5th June | 15:00 | Oracle Office, Johannesburg | |
| South Africa | Wed 3rd July | 15:00 | Oracle Office, Johannesburg | |
| Turkey | Wed 5th June | 15:00 | Oracle Office, Istanbul | |
| Turkey | Wed 3rd July | 15:00 | Oracle Office, Ankara | |
| UAE | Wed 3rd July | 10:00 | Oracle Office, Dubai | I plan to attend |
| Ukraine | June - TBD | | Oracle Office, Kiev | |
| Ukraine | July - TBD | | Oracle Office, Kiev | |
We look forward to seeing you there.
Your local Oracle Support team.
Set the agenda
Let us know what aspects of Oracle Support you want to learn about, and we’ll put them on our agenda. Email us with your topic suggestions.
Faster data move on EXADATA I
Introduction
In my work among other things I tune and tweak solutions for EXADATA. Today I’ll write about a big improvement we achieved with a process that moves data from the operational tables to the ones where the history is stored.
This will not be a technical post. While I talk about using advanced technologies, I will not discuss code or deep details of them in this post.
And yes, when I say post, I mean a series of posts. This will be too long to be a single post. I’ll break it up into an introduction and then a post on each area of improvement.
Let’s first discuss the before situation. These tables are written to throughout the day. The log records are needed both to investigate how transactions were executed and to satisfy legal requirements. This is a highly regulated industry, and for good reason, as mistakes could put someone’s life in danger.
In this situation the solutions were writing around 50 million log records per day to five tables. These tables all had a primary key based on a sequence and there was also referential integrity set up. This means that for the indexes, all processes were writing to the same place on disk. The lookup on the referential integrity was also looking at the same place. An attempt to remedy some of this had been made by hash partitioning the tables. The write activity was intense enough during the day that most of the logging had to be turned off as the solution otherwise was too slow. This of course has legal as well as diagnostic implications.
What’s worse is that once all that data was written, it had to be moved to another database where the history is kept. This process was even slower, and the estimate for how long it would take to move one day’s worth of data was 16 hours. It never did run for that long because it was not allowed to run during the day; it had to start after midnight and finish before 7 am. As a result the backlog grew every night until logging was turned off for a while, after which the move caught up a little each night.
This series will have the following parts:
- Introduction (this post)
- Writing log records
- Moving to history tables
- Reducing storage requirements
- Wrap-up and summary
The plan is to publish one part each week. Hopefully I’ll have time to publish some more technical posts between the posts in this series.








